Merge branch 'release-5.0.0'

Daniel Kraus
2017-09-01 05:40:54 +02:00
25 changed files with 2997 additions and 888 deletions

1
.atomignore Normal file

@ -0,0 +1 @@
gh-pages/

447
Doxyfile

File diff suppressed because it is too large

41
NEWS

@ -1,11 +1,24 @@
Version 4.1.0 (2017-08-24)
Version 5.0.0 (2017-09-01)
------------------------------------------------------------------------
- Change: The $wgLinkTitlesBatchTimeLimit configuration variable was renamed to $wgLinkTitlesSpecialPageReloadAfter.
- Fix: Blacklist did not always work properly
- Fix: Contents of <noautolink> tags are now properly parsed as Wiki text.
- Fix: Links to other namespaces were not prefixed properly.
- Fix: The firstOnly option finally also works if a page contains a link to a given other page that was not currently added by the extension, i.e. that existed prior to an edit or that was manually added.
- Fix: When $wgCapitalLinks was true, the extension would not work with non-latin languages.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.1.0 (2017-08-25)
------------------------------------------------------------------------
- Fix: Properly handle templates that include other templates.
- New: Mark sections that are not to be automatically linked with the new `<noautolinks>..</noautolinks>` tag.
- New: Mark sections that are to be automatically linked with the new `<autolinks>..</autolinks>` tag. This tag only makes sense on pages with the `__NOAUTOLINKS__` magic word, or if both `$wgLinkTitlesParseOnEdit` and `$wgLinkTitlesParseOnRender` are set to false. Note that this tag is parsed when a page is rendered, not when it is saved. Therefore, the links will not appear in the page source.
- Fix: Properly handle templates that include other templates.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.9 (2017-03-21)
@ -13,7 +26,7 @@ Version 4.0.9 (2017-03-21)
- Fix: __NOAUTOLINKS__ was not respected during rendering.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.8 (2017-02-16)
@ -23,7 +36,7 @@ Version 4.0.8 (2017-02-16)
- Fix: The special page and the maintenance script did not work in MW 1.28.
- Fix: The special page did not work.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.7 (2017-01-02)
@ -31,7 +44,7 @@ Version 4.0.7 (2017-01-02)
- Improvement: Increase performance of special page and maintenance script.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.6 (2016-12-28)
@ -40,7 +53,7 @@ Version 4.0.6 (2016-12-28)
- Fix: Bug fixes.
- Fix: Custom namespace weights were not respected.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.5 (2016-12-14)
@ -50,7 +63,7 @@ Version 4.0.5 (2016-12-14)
- Fix: Remove leftover error log call.
- Improvement: Refactored maintenance script, improving user interaction.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.4 (2016-11-30)
@ -58,7 +71,7 @@ Version 4.0.4 (2016-11-30)
- Fix: Do not link titles twice if $wgLinkTitlesFirstOnly and $wgLinkTitlesSmartMode are both true.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.3 (2016-11-22)
@ -66,7 +79,7 @@ Version 4.0.3 (2016-11-22)
- Fix: __NOAUTOLINKS__ magic word would not be respected when saving an edited page.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.2 (2016-11-09)
@ -75,7 +88,7 @@ Version 4.0.2 (2016-11-09)
- FIX: Removed a fatal bug in the LinkTitles_Maintenance script.
- FIX: Repaired severely broken namespaces support.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.1 (2016-11-08)
@ -83,7 +96,7 @@ Version 4.0.1 (2016-11-08)
- FIX: Prevent syntax error when accessing special page.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 4.0.0 (2016-11-05)
@ -95,7 +108,7 @@ Version 4.0.0 (2016-11-05)
- NEW: Support namespaces.
- NEW: Use the new extension format introduced by MediaWiki 1.25; the extension will no longer run with older MediaWiki versions.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
Version 3.1.0. (2015-02-05)
@ -103,7 +116,7 @@ Version 3.1.0. (2015-02-05)
- IMPROVEMENT: Do not link inside <file>...</file> tags.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

531
README.md

@ -2,21 +2,435 @@
LinkTitles
==========
MediaWiki extension that automatically adds links to words that match titles of existing pages.
[MediaWiki extension](https://www.mediawiki.org/wiki/Extension:LinkTitles) that
automatically adds links to words that match titles of existing pages.
For more information, see http://www.mediawiki.org/wiki/Extension:LinkTitles
Minimum requirements: MediaWiki 1.25, PHP 5.3
Source code documentation can be found at the [Github project
pages](http://bovender.github.io/LinkTitles).
This extension is [semantically versioned](http://semver.org).
Minimum requirements: MediaWiki 1.25, PHP 5.3. Source code documentation can be
found at the [Github project pages](http://bovender.github.io/LinkTitles).
Contributing
Table of contents
-----------------
1. [Overview](#overview)
- [Versions](#versions)
2. [Installation](#installation)
3. [Usage](#usage)
- [Editing a page](#editing-a-page)
- [Preventing automatic linking after minor edits](#preventing-automatic-linking-after-minor-edits)
- [Viewing a page](#viewing-a-page)
- [Including and excluding pages with Magic Words](#including-and-excluding-pages-with-magic-words)
- [Enable or disable automatic linking for sections](#enable-or-disable-automatic-linking-for-sections)
- [Namespace support](#namespace-support)
- [Batch processing](#batch-processing)
- [Special:LinkTitles](#special-linktitles)
- [Maintenance script](#maintenance-script)
4. [Configuration](#configuration)
- [Linking when a page is edited and saved](#linking-when-a-page-is-edited-and-saved)
- [Linking when a page is rendered for display](#linking-when-a-page-is-rendered-for-display)
- [Enabling case-insensitive linking (smart mode)](#enabling-case-insensitive-linking-smart-mode)
- [Dealing with custom namespaces](#dealing-with-custom-namespaces)
- [Linking or skipping headings](#linking-or-skipping-headings)
- [Prioritizing pages with short titles](#prioritizing-pages-with-short-titles)
- [Filtering pages by title length](#filtering-pages-by-title-length)
- [Excluding pages from being linked to](#excluding-pages-from-being-linked-to)
- [Dealing with templates](#dealing-with-templates)
- [Multiple links to the same page](#multiple-links-to-the-same-page)
- [Partial words](#partial-words)
- [Special page configuration](#special-page-configuration)
5. [Development](#development)
- [Contributors](#contributors)
- [Testing](#testing)
6. [License](#license)
Overview
--------
The **LinkTitles** extension automatically adds links to existing page titles
that occur on a given page. This will automatically cross-reference your wiki
for you. The extension can operate in three ways that can be used independently:
1. Whenever a page is edited and saved, the extension will look if any existing
page titles occur in the text, and automatically add links (`[[...]]`) to the
corresponding pages.
2. Links may also be added on the fly whenever a page is rendered for display.
Most of the time, MediaWiki will fetch previously rendered pages from cache upon
a page request, but whenever a page is refreshed, the LinkTitles extension can
add its page links. These links are not hard-coded in the Wiki text. The
original content will not be modified.
3. Batch mode enables Wiki administrators to process all pages in a Wiki at
once. Batch processing can either be started from a special page, or from the
server's command line (see [below](#batch-processing)).
### Versions
This extension is [semantically versioned](http://semver.org). In short, this
means that the first version number (the 'major') only changes on substantial
changes. The second number (the 'minor') changes when features are added or
significantly improved. The third number (the 'patch level') changes when bugs
are fixed.
Version | Date | Major changes ||
-|-|-|-
5 | 09-2017 | Rewrote the entire extension; vastly improved namespace support; some breaking changes | [Details][v5.0.0]
4 | 11-2016 | Changed format of the extension for MediaWiki version 1.25; added basic namespace support | [Details][v4.0.0]
3 | 02-2015 | Added magic words; improved performance | [Details][3.0.0]
2 | 11-2013 | Introduced smart mode | [Details][2.0.0]
1 | 05-2012 | First stable release |
[v5.0.0]: https://github.com/bovender/LinkTitles/releases/tag/v5.0.0
[v4.0.0]: https://github.com/bovender/LinkTitles/releases/tag/v4.0.0
[3.0.0]: https://github.com/bovender/LinkTitles/compare/2.4.1...3.0.0
[2.0.0]: https://github.com/bovender/LinkTitles/compare/1.8.1...2.0.0
For more details, click the 'Details' links, see the `NEWS` file in the
repository for a user-friendly changelog, or study the commit messages.
Installation
------------
To obtain the extension, you can either download a compressed archive from the
[Github releases page](https://github.com/bovender/LinkTitles/releases): Choose
one of the 'Source code' archives and extract it in your Wiki's `extensions`
folder. Note that these archives contain a folder that is named after the
release version, e.g. `LinkTitles-5.0.0`. You may want to rename the folder to
`LinkTitles`.
Alternatively (and preferred by the author), if you have [Git](https://git-scm.com),
you can pull the repository in the `extensions` folder.
To activate the extension, add the following to your `LocalSettings.php` file:
wfLoadExtension( 'LinkTitles' );
Do not forget to adjust the [configuration](#configuration) to your needs.
If your MediaWiki version is really old (1.24 and older), you need to use
a [different mechanism](https://www.mediawiki.org/wiki/Manual:Extensions#Installing_an_extension).
Usage
-----
### Editing a page
By default, the LinkTitles extension will add links to existing pages whenever
you edit and save a page. Unless you changed the configuration variables, it will
link whole words only, prefer longer target page titles over shorter ones, skip
headings, and add multiple links if a page title appears more than once on the
page. All of this is configurable; see the [Configuration](#configuration)
section.
### Preventing automatic linking after minor edits
If the 'minor edit' check box is marked when you save a page, the extension will
not operate.
### Viewing a page
If you do not want the LinkTitles extension to modify the page sources, you can
also have links added whenever a page is being viewed (or, technically, when it
is being rendered). MediaWiki caches rendered pages. Therefore, links do not need
to be added every time a page is being viewed. See the
[`$wgLinkTitlesParseOnRender`](#linking-when-a-page-is-rendered-for-display)
configuration variable.
### Including and excluding pages with Magic Words
Add the magic word `__NOAUTOLINKS__` to a page to prevent automatic linking of
page titles.
The presence of `__NOAUTOLINKTARGET__` prevents a page from being automatically
linked to from other pages.
### Enable or disable automatic linking for sections
To **exclude** a section on your page from automatic linking, wrap it in
`<noautolinks>...</noautolinks>` tags.
To **include** a section on your page for automatic linking, wrap it in
`<autolinks>...</autolinks>` tags. Of course this only makes sense if both
`$wgLinkTitlesParseOnEdit` and `$wgLinkTitlesParseOnRender` are set to `false`
**or** if the page contains the `__NOAUTOLINKS__` magic word.
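As a sketch of the first case, the following `LocalSettings.php` fragment switches off both automatic modes, so that only text wrapped in `<autolinks>` tags would be linked. The variable names are the ones documented in the Configuration section; whether this combination suits your wiki is for you to decide.

```php
// LocalSettings.php fragment (sketch): with both automatic modes disabled,
// only sections wrapped in <autolinks>...</autolinks> are candidates for
// automatic linking.
$wgLinkTitlesParseOnEdit = false;
$wgLinkTitlesParseOnRender = false;
```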
### Namespace support
By default, LinkTitles will only process pages in the `NS_MAIN` namespace (i.e.,
'normal' Wiki pages). You can modify the configuration to process pages in
other 'source' namespaces as well. By default, LinkTitles will only link to pages
that are in the same namespace as the page being edited or viewed. Again, additional
'target' namespaces may be added in the [configuration](#dealing-with-custom-namespaces).
If a page contains another page's title that is prefixed with the namespace
(e.g. `my_namespace:other page`), LinkTitles will _not_ add a link. It is assumed
that if someone deliberately types a namespace-qualified page title, they might
just as well add the link markup (`[[...]]`) as well. It is the LinkTitles
extension's intention to facilitate writing non-technical texts and have links
to existing pages added automatically.
### Batch processing
The extension provides two methods to batch-process all pages in a Wiki: A
special page (i.e., graphical user interface) and a command-line maintenance
script.
#### Special:LinkTitles
The special page provides a simple web interface to trigger batch processing. To
avoid blocking the web server for too long, the page will frequently reload
itself (this can be controlled by the `$wgLinkTitlesSpecialPageReloadAfter`
configuration variable that sysops can set in the `LocalSettings.php` file).
For security reasons, by default only users in the 'sysop' group are allowed to
view the special page (otherwise unauthorized people could trigger a parsing of
your entire wiki). To allow other user groups to view the page as well, add a
line `$wgGroupPermissions['<groupname>']['linktitles-batch'] = true;` to
`LocalSettings.php`.
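For example, a fragment along these lines would open the special page to one more group. The group name `bureaucrat` is only an illustration; substitute whichever group exists on your wiki.

```php
// LocalSettings.php fragment (sketch): allow members of the 'bureaucrat'
// group to use Special:LinkTitles in addition to sysops.
// 'bureaucrat' is an example group name, not a recommendation.
$wgGroupPermissions['bureaucrat']['linktitles-batch'] = true;
```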
#### Maintenance script
If you have access to a shell on the server that runs your wiki, and are allowed
to execute `/bin/php` on the command line, you can use the extension's
maintenance script. Unlike MediaWiki's built-in maintenance scripts, this
resides not in the `maintenance/` subdirectory but in the extension's own
directory (the one where you downloaded and extracted the files to).
To trigger parsing of all pages, issue:
php linktitles-cli.php
You can interrupt the process by hitting `CTRL+C` at any time.
To continue parsing at a later time, make a note of the index number of the last
page that was processed (e.g., 37), and use the maintenance script with the
`--start` option (or short `-s`) to indicate the start index:
php linktitles-cli.php -s 37
See all available options with:
php linktitles-cli.php -h
Configuration
--------------
To change the configuration, set the variables in your `LocalSettings.php` file.
The code lines below show the default values of the configuration variables.
### Linking when a page is edited and saved
$wgLinkTitlesParseOnEdit = true;
Parse page content whenever it is edited and saved, unless the 'minor edit' box is
checked. This is the default mode of operation. It has the disadvantage that
newly created pages won't be linked to from existing pages until those existing
pages are edited and saved.
### Linking when a page is rendered for display
$wgLinkTitlesParseOnRender = false;
Parse page content when it is rendered for viewing. Unlike the "parse on edit"
mode of operation, this will *not* hard-code the links in the Wiki text. Thus,
if you edit a page that had links added to it during rendering, you will not see
the links in the Wiki markup.
Note that MediaWiki caches rendered pages in the database, so that pages rarely
need to be rendered. Rendering is triggered when a page is viewed after it was saved.
Therefore, whether you want to enable both parse-on-edit and parse-on-render
depends on whether you want to have links (`[[...]]`) added to the Wiki markup.
Please note that the extension will work on a fully built page when this mode is
enabled; therefore, it *will* add links to text transcluded from templates,
regardless of the configuration setting of `$wgLinkTitlesSkipTemplates`.
You can purge the page cache and trigger rendering by adding `?action=purge` to
the URL.
### Enabling case-insensitive linking (smart mode)
$wgLinkTitlesSmartMode = true;
With smart mode enabled, the extension will first perform a case-sensitive
search for page titles in the current page; then it will search for occurrences
of the page titles in a case-insensitive way and add aliased ('piped') links.
Thus, if you have a page `MediaWiki Extensions`, but write `Mediawiki
extensions` (with a small 'e') in your text, LinkTitles would generate a link
`[[MediaWiki Extensions|Mediawiki extensions]]`, obviating the need to add
dummy pages for variants of page titles with different cases.
Smart mode is enabled by default. You can disable it to increase performance of
the extension.
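The two passes described above can be illustrated with a small standalone script. This is a sketch only, not the extension's actual code: `linkTitleSmart()` is a hypothetical helper, it assumes the input text does not contain any links yet, and it uses PHP 7 type declarations.

```php
<?php
// Illustrative sketch of the two-pass idea; NOT the extension's implementation.
// linkTitleSmart() is a hypothetical helper. Assumes $text has no links yet.
function linkTitleSmart( string $text, string $title ): string {
    $pattern = '/\b' . preg_quote( $title, '/' ) . '\b/';

    // Pass 1: exact, case-sensitive matches become plain links.
    $text = preg_replace( $pattern, '[[' . $title . ']]', $text );

    // Separate the links created so far from the remaining text, so that
    // pass 2 leaves them alone. Odd indexes hold the links.
    $chunks = preg_split( '/(\[\[.*?\]\])/', $text, -1, PREG_SPLIT_DELIM_CAPTURE );

    foreach ( $chunks as $i => $chunk ) {
        if ( $i % 2 === 0 ) {
            // Pass 2: case-insensitive matches become piped links that keep
            // the spelling found in the text ($0 is the matched text).
            $chunks[$i] = preg_replace( $pattern . 'i', '[[' . $title . '|$0]]', $chunk );
        }
    }
    return implode( '', $chunks );
}

echo linkTitleSmart(
    'See MediaWiki Extensions and Mediawiki extensions.',
    'MediaWiki Extensions'
), "\n";
// prints: See [[MediaWiki Extensions]] and [[MediaWiki Extensions|Mediawiki extensions]].
```

The second pass is the additional work that disabling smart mode would save.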
### Dealing with custom namespaces
$wgLinkTitlesSourceNamespaces = [];
Specifies additional namespaces for pages that should be processed by the
LinkTitles extension. If this is an empty array (or anything else that PHP
evaluates to `false`), the default namespace `NS_MAIN` will be assumed.
The values in this array must be numbers/namespace constants (`NS_xxx`).
$wgLinkTitlesTargetNamespaces = [];
By default, only pages in the same namespace as the page being edited or viewed
will be considered as link targets. If you want to link to pages in other
namespaces, list them here. Note that the source page's own namespace will also
be included, unless you change the `$wgLinkTitlesSameNamespace` option.
The values in this array must be numbers/namespace constants (`NS_xxx`).
$wgLinkTitlesSameNamespace = true;
If you do not want to have a page's own namespace included in the possible
target namespaces, set this to false. Of course, if `$wgLinkTitlesSameNamespace`
is `false` and `$wgLinkTitlesTargetNamespaces` is empty, LinkTitles will add
no links at all because there are no target namespaces.
#### Example: Default configuration
$wgLinkTitlesSourceNamespaces = [];
$wgLinkTitlesTargetNamespaces = [];
$wgLinkTitlesSameNamespace = true;
Process pages in the `NS_MAIN` namespace only, and add links to the `NS_MAIN`
namespace only (i.e., the same namespace that the source page is in).
#### Example: Custom namespace only
$wgLinkTitlesSourceNamespaces = [ NS_MY_NAMESPACE ];
$wgLinkTitlesTargetNamespaces = [];
$wgLinkTitlesSameNamespace = true;
Process pages in the `NS_MY_NAMESPACE` namespace only, and add links to the
`NS_MY_NAMESPACE` namespace only (i.e., the same namespace that the source page
is in).
#### Example: Link to `NS_MAIN` only
$wgLinkTitlesSourceNamespaces = [ NS_MY_NAMESPACE ];
$wgLinkTitlesTargetNamespaces = [ NS_MAIN ];
$wgLinkTitlesSameNamespace = false;
Process pages in the `NS_MY_NAMESPACE` namespace only, and add links to the
`NS_MAIN` namespace only. Do not link to pages that are in the same namespace
as the source namespace (i.e., `NS_MY_NAMESPACE`).
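The interplay of the three options in these examples can be summarized in a few lines of standalone PHP. This is a sketch under assumptions: `effectiveTargetNamespaces()` is a hypothetical helper, not a function of the extension, and the namespace number 3000 for the custom namespace is made up for the demonstration.

```php
<?php
// Hypothetical helper for illustration; not part of the extension.
// Returns the namespaces that would be searched for link targets, given the
// namespace of the source page and the two target-related options.
// Uses PHP 7 type declarations.
function effectiveTargetNamespaces( int $sourceNamespace, array $targetNamespaces, bool $sameNamespace ): array {
    $result = $targetNamespaces;
    if ( $sameNamespace ) {
        // The source page's own namespace is added to the list.
        $result[] = $sourceNamespace;
    }
    // Remove duplicates and reindex.
    return array_values( array_unique( $result ) );
}

$nsMain = 0;      // MediaWiki's NS_MAIN has the value 0
$nsCustom = 3000; // assumed number of the custom namespace

// Default configuration, page in the main namespace: [0]
print_r( effectiveTargetNamespaces( $nsMain, [], true ) );
// 'Link to NS_MAIN only', page in the custom namespace: [0]
print_r( effectiveTargetNamespaces( $nsCustom, [ $nsMain ], false ) );
// No target namespaces at all, so no links would be added: []
print_r( effectiveTargetNamespaces( $nsCustom, [], false ) );
```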
### Linking or skipping headings
$wgLinkTitlesParseHeadings = false;
Determines whether or not to add links to headings. By default, the extension
will leave your (sub)headings untouched. Only applies to parse-on-edit!
There is a **known issue** that the extension regards incorrectly formatted
headings as headings. Consider this line:
## incorrect heading #
This line is not recognized as a heading by MediaWiki because the pound signs
(`#`) are not balanced. However, the LinkTitles extension will currently treat
this line as a heading (if it starts and ends with pound signs).
### Prioritizing pages with short titles
$wgLinkTitlesPreferShortTitles = false;
If `$wgLinkTitlesPreferShortTitles` is set to `true`, parsing will begin with
shorter page titles. By default, the extension will attempt to link the longest
page titles first, as these generally tend to be more specific.
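To picture what the option changes, here is a minimal sketch of ordering candidate titles by length. `sortTitlesByLength()` is a hypothetical helper, not the extension's code, and it measures length in bytes with `strlen()` for simplicity.

```php
<?php
// Hypothetical helper for illustration; not part of the extension.
// Orders candidate titles so that the preferred ones are tried first.
// Uses PHP 7 type declarations; strlen() counts bytes, not characters.
function sortTitlesByLength( array $titles, bool $preferShortTitles ): array {
    usort( $titles, function ( string $a, string $b ) use ( $preferShortTitles ) {
        $diff = strlen( $a ) - strlen( $b );
        // Ascending for short-first, descending for long-first (default).
        return $preferShortTitles ? $diff : -$diff;
    } );
    return $titles;
}

$titles = [ 'Wiki', 'MediaWiki Extensions', 'MediaWiki' ];
print_r( sortTitlesByLength( $titles, false ) );
// MediaWiki Extensions, MediaWiki, Wiki
print_r( sortTitlesByLength( $titles, true ) );
// Wiki, MediaWiki, MediaWiki Extensions
```

With the default order, the text "MediaWiki Extensions" would be matched against the most specific title before the shorter titles get a chance.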
### Filtering pages by title length
$wgLinkTitlesMinimumTitleLength = 3;
Only link to page titles that have a certain minimum length. In my experience,
very short titles can be ambiguous. For example, "mg" may be "milligrams" on a
page, but there may be a page title "Mg" which redirects to the page
"Magnesium". This setting prevents erroneous linking to very short titles by
setting a minimum length. You can adjust this setting to your liking.
### Excluding pages from being linked to
$wgLinkTitlesBlackList = [];
Exclude page titles in the array from automatic linking. You can populate this
array with common words that happen to be page titles in your Wiki. For example,
if for whatever reason you had a page "And" in your Wiki, every occurrence of
the word "and" would be linked to this page.
To add page titles to the black list, you can use statements such as
$wgLinkTitlesBlackList[] = 'Some special page title';
in your `LocalSettings.php` file. Use one of these for every page title that you want to
put on the black list. Alternatively, you can specify the entire array:
$wgLinkTitlesBlackList = [ 'Some special page title', 'Another one' ];
Keep in mind that a MediaWiki page title always starts with a capital letter
unless you have `$wgCapitalLinks = false;` in your `LocalSettings.php`.
**Therefore, if you have lowercase first letters in the black list array, they
will have no effect.**
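The effect of the minimum title length and the black list, including the capitalization pitfall, can be sketched as follows. `filterTargets()` is a hypothetical helper; it assumes an exact string comparison against the black list and treats the minimum length as inclusive, which may differ from the extension's internals.

```php
<?php
// Hypothetical helper for illustration; not part of the extension.
// Drops titles that are too short or black-listed.
// Assumptions: exact (case-sensitive) comparison, inclusive minimum length.
// Uses PHP 7 type declarations.
function filterTargets( array $titles, int $minimumTitleLength, array $blackList ): array {
    $kept = array_filter( $titles, function ( string $title ) use ( $minimumTitleLength, $blackList ) {
        return strlen( $title ) >= $minimumTitleLength
            && !in_array( $title, $blackList, true );
    } );
    return array_values( $kept );
}

$titles = [ 'And', 'Mg', 'Magnesium' ];

// 'Mg' is shorter than 3 characters; 'And' is black-listed.
print_r( filterTargets( $titles, 3, [ 'And' ] ) ); // Magnesium

// A lowercase entry does not match the capitalized title 'And'.
print_r( filterTargets( $titles, 3, [ 'and' ] ) ); // And, Magnesium
```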
### Dealing with templates
$wgLinkTitlesSkipTemplates = false;
If set to true, do not parse the variable text of templates, i.e. in
`{{my template|some variable=some content}}`, leave the entire text between the
curly brackets untouched. If set to false (default setting), the text after the
pipe symbol ("|") will be parsed.
Note: This setting works only with parse-on-edit; it does not affect
parse-on-render!
### Multiple links to the same page
$wgLinkTitlesFirstOnly = false;
If set to true, only link the first occurrence of a title on a given page. If
a link is piped, i.e. hiding the title of the target page:
[[target page|text that appears as link text]]
then the LinkTitles extension does not count that as an occurrence.
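A rough way to picture the difference between the two settings is the `$limit` argument of PHP's `preg_replace()`. The helper `linkOccurrences()` below is hypothetical and deliberately simplified: it ignores links that already exist in the text, which the extension itself takes into account.

```php
<?php
// Hypothetical helper for illustration; not the extension's implementation.
// Links either the first or all whole-word occurrences of $title.
// Simplification: links already present in $text are not considered.
// Uses PHP 7 type declarations.
function linkOccurrences( string $text, string $title, bool $firstOnly ): string {
    $pattern = '/\b' . preg_quote( $title, '/' ) . '\b/';
    $limit = $firstOnly ? 1 : -1; // -1 means "no limit" for preg_replace()
    return preg_replace( $pattern, '[[$0]]', $text, $limit );
}

echo linkOccurrences( 'Wiki here, Wiki there', 'Wiki', true ), "\n";
// prints: [[Wiki]] here, Wiki there
echo linkOccurrences( 'Wiki here, Wiki there', 'Wiki', false ), "\n";
// prints: [[Wiki]] here, [[Wiki]] there
```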
### Partial words
$wgLinkTitlesWordStartOnly = true;
$wgLinkTitlesWordEndOnly = true;
Restrict linking to occurrences of the page titles at the start of a word. If
you want to have only the exact page titles linked, you need to set **both**
options `$wgLinkTitlesWordStartOnly` and `$wgLinkTitlesWordEndOnly` to *true*.
On the other hand, if you want to have all occurrences of a page title linked,
even if they are in the middle of a word, you need to set both options to
*false*.
Keep in mind that linking in MediaWiki is generally *case-sensitive*.
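One way to think about the two options is as optional word-boundary anchors (`\b`) around the title in a regular expression. The sketch below illustrates this idea with a hypothetical helper, `buildTitlePattern()`; the patterns that the extension actually builds may look different.

```php
<?php
// Hypothetical helper for illustration; not the extension's implementation.
// Builds a case-sensitive pattern for $title with optional word boundaries.
// Uses PHP 7 type declarations.
function buildTitlePattern( string $title, bool $wordStartOnly, bool $wordEndOnly ): string {
    $start = $wordStartOnly ? '\b' : '';
    $end = $wordEndOnly ? '\b' : '';
    return '/' . $start . preg_quote( $title, '/' ) . $end . '/';
}

// Both options true: 'Media' is not linked inside 'MediaWiki'.
var_dump( preg_match( buildTitlePattern( 'Media', true, true ), 'MediaWiki' ) );  // int(0)
// Start only: 'Media' matches at the start of 'MediaWiki' ...
var_dump( preg_match( buildTitlePattern( 'Media', true, false ), 'MediaWiki' ) ); // int(1)
// ... but 'Wiki' does not, because it sits at the end of the word.
var_dump( preg_match( buildTitlePattern( 'Wiki', true, false ), 'MediaWiki' ) );  // int(0)
// End only: 'Wiki' matches at the end of 'MediaWiki'.
var_dump( preg_match( buildTitlePattern( 'Wiki', false, true ), 'MediaWiki' ) );  // int(1)
```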
### Special page configuration
$wgLinkTitlesSpecialPageReloadAfter = 1; // seconds
The `Special:LinkTitles` page performs batch processing of pages by repeatedly
calling itself. This happens to prevent timeouts on your server. The default
reload interval is 1 second.
Development
-----------
If you wish to contribute, please issue pull requests against the `develop`
branch, as I follow Vincent Driessen's advice on [A successful Git branching
model](http://nvie.com/git-model) (knowing that there are [alternative
@ -26,9 +440,96 @@ The `master` branch contains stable releases only, so it is safe to pull the
master branch if you want to install the extension for your own wiki.
Contributors
------------
### Contributors
- Daniel Kraus (@bovender), main developer
- Ulrich Strauss (@c0nnex), namespaces
- Brent Laabs (@labster), code review and bug fixes
- Daniel Kraus (@bovender), main developer
- Ulrich Strauss (@c0nnex), namespaces
- Brent Laabs (@labster), code review and bug fixes
- @tetsuya-zama, bug fix
- @yoshida, namespace-related bug fixes
### Testing
Starting from version 5, LinkTitles finally comes with phpunit tests. The code
is not 100% covered yet. If you find something does not work as expected, let me
know and I will try to add unit tests and fix it.
Here's how I set up the testing environment. This may not be the canonical way
to do it. Basic information on testing MediaWiki can be found
[here](https://www.mediawiki.org/wiki/Manual:PHP_unit_testing).
The following assumes that you have an instance of MediaWiki running locally on
your development machine. This assumes that you are running Linux (I personally
use Ubuntu).
1. Pull the MediaWiki repository:
cd ~/Code
git clone --depth 1 https://phabricator.wikimedia.org/source/mediawiki.git
2. Install [composer](https://getcomposer.org) locally and fetch the
dependencies (including development dependencies):
Follow the instructions on the [composer download page](https://getcomposer.org/download),
but instead of running `php composer-setup.php`, run:
php composer-setup.php --install-dir=bin --filename=composer
bin/composer install
3. Install phpunit (it was already installed on my Ubuntu system when I began
testing LinkTitles, so I leave it up to you to figure out how to do it).
4. Copy your `LocalSettings.php` over from your local MediaWiki installation
and remove (or comment out) any lines that reference extensions or skins that
you are not going to install to your test environment. For the purposes of
testing the LinkTitles extension, leave the following line in place:
wfLoadExtensions( array( 'LinkTitles' ));
And ensure the settings file contains the following:
$wgShowDBErrorBacktrace = true;
5. Create a symbolic link to your copy of the LinkTitles repository:
cd ~/Code/mediawiki/extensions
ln -s ~/Code/LinkTitles
6. Make sure your local MediaWiki instance is up to date. Otherwise phpunit may
fail and tell you about database problems.
This is because the local database is used as a template for the unit tests.
For example, I initially had MW 1.26 installed on my laptop, but the cloned
repository was MW 1.29.1. It's probably also possible to clone the repository
with a specific version tag which matches your local installation.
7. Run the tests:
cd ~/Code/mediawiki/tests/phpunit
php phpunit.php --group bovender
This will run all tests from the 'bovender' group, i.e. tests for my extensions.
If you linked just the LinkTitles extension in step 5, only this extension
will be tested.
License
-------
Copyright 2012-2017 Daniel Kraus <mailto:bovender@bovender.de> (@bovender)
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
MA 02110-1301, USA.

View File

@ -9,16 +9,10 @@
This is the [source code][] documentation for the [LinkTitles][] extension
for [MediaWiki][].
The central class is LinkTitles, which contains only static functions. If
you are looking for the linking algorithm, inspect the
LinkTitles\\Extension::parseContent() function.
The extension provides two methods for batch-processing of pages. One is a
@link LinkTitles\\Special special page @endlink that provides web-access (by
default restricted to sysops). The other is a @link LinkTitles\\Cli
maintenance script @endlink that can be called from the command line if you
have access to your server and are authorized to run php from the command
line.
With version 5.0, the code is more extensively commented than ever. Version 5
brought a major refactoring, and the extension now consists of several classes
with clearly defined concerns. Look at the class comments to find out what they
do.
@note The source code that is referenced in this documentation may not
necessarily reflect the latest code in the repository! Make sure to check

View File

@ -3,11 +3,13 @@
"author": [
"[https://www.mediawiki.org/wiki/User:Bovender Daniel Kraus (bovender)]",
"Ulrich Strauss (c0nnex)",
"Brent Laabs (labster)"
"Brent Laabs (labster)",
"tetsuya-zama",
"yoshida"
],
"type": "parserhook",
"url": "https://www.mediawiki.org/wiki/Extension:LinkTitles",
"version": "4.1.0",
"version": "5.0.0",
"license-name": "GPL-2.0+",
"descriptionmsg": "linktitles-desc",
"requires": {
@ -27,14 +29,21 @@
"LinkTitlesSmartMode": true,
"LinkTitlesWordStartOnly": true,
"LinkTitlesWordEndOnly": true,
"LinkTitlesBatchTimeLimit": 1,
"LinkTitlesNamespaces": [
0
]
"LinkTitlesSpecialPageReloadAfter": 1,
"LinkTitlesSourceNamespaces": [],
"LinkTitlesTargetNamespaces": [],
"LinkTitlesSameNamespace": true
},
"AutoloadClasses": {
"LinkTitles\\Extension": "includes/LinkTitles_Extension.php",
"LinkTitles\\Special": "includes/LinkTitles_Special.php"
"LinkTitles\\Extension": "includes/Extension.php",
"LinkTitles\\Linker": "includes/Linker.php",
"LinkTitles\\Source": "includes/Source.php",
"LinkTitles\\Target": "includes/Target.php",
"LinkTitles\\Targets": "includes/Targets.php",
"LinkTitles\\Splitter": "includes/Splitter.php",
"LinkTitles\\Config": "includes/Config.php",
"LinkTitles\\Special": "includes/Special.php",
"LinkTitles\\TestCase": "tests/phpunit/TestCase.php"
},
"SpecialPages": {
"LinkTitles": "LinkTitles\\Special"
@ -61,9 +70,8 @@
"LinkTitles\\Extension::onParserFirstCallInit"
]
},
"callback": "LinkTitles\\Extension::setup",
"ExtensionMessagesFiles": {
"LinkTitlesMagic": "includes/LinkTitles_Magic.php"
"LinkTitlesMagic": "includes/Magic.php"
},
"MessagesDirs": {
"LinkTitles": [

221
includes/Config.php Normal file

@ -0,0 +1,221 @@
<?php
/**
* The LinkTitles\Config class holds configuration for the LinkTitles extension.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Holds LinkTitles configuration.
*
* This class encapsulates the global configuration variables so we do not have
* to pull those globals into scope in the individual LinkTitles classes.
*
* Using a dedicated configuration class also facilitates overriding certain
* options, i.e. in a maintenance script that is invoked with flags from the
* command line.
*
* @since 5.0.0
*/
class Config {
/**
* Whether to add links to a page when the page is edited/saved.
* @var bool $parseOnEdit
*/
public $parseOnEdit;
/**
* Whether to add links to a page when the page is rendered.
* @var bool $parseOnRender
*/
public $parseOnRender;
/**
* Indicates whether to prioritize short over long titles.
* @var bool $preferShortTitles
*/
public $preferShortTitles;
/**
* Minimum length of a page title for it to qualify as a potential link target.
* @var int $minimumTitleLength
*/
public $minimumTitleLength;
/**
* Array of page titles that must never be link targets.
*
* This may be useful to exclude common abbreviations or acronyms from
* automatic linking.
* @var Array $blackList
*/
public $blackList;
/**
* Array of those namespaces (integer constants) whose pages may be linked
* when edited.
* @var Array $sourceNamespaces
*/
public $sourceNamespaces;
/**
* Array of those namespaces (integer constants) whose pages may be linked
* to a source page.
* @var Array $targetNamespaces
*/
public $targetNamespaces;
/**
* Indicates whether to add a link to the first occurrence of a page title
* only (true), or add links to all occurrences on the source page (false).
* @var bool $firstOnly
*/
public $firstOnly;
/**
* Indicates whether to operate in smart mode, i.e. link to pages even if the
* case does not match. Without smart mode, pages are linked to only if the
* exact title appears on the source page.
* @var bool $smartMode
*/
public $smartMode;
/**
* Mirrors the global MediaWiki variable $wgCapitalLinks that indicates
* whether or not page titles are fully case sensitive.
* @var bool $capitalLinks
*/
public $capitalLinks;
/**
* Whether or not to link to pages only if the page title appears at the
* start of a word on the source page (i.e., link 'MediaWiki' to a page
* 'Media', but not to a page 'Wiki').
*
* Set both $wordStartOnly and $wordEndOnly to true to enforce matching
* whole titles.
*
* @var bool $wordStartOnly
*/
public $wordStartOnly;
/**
* Whether or not to link to pages only if the page title appears at the
* end of a word on the source page (i.e., link 'MediaWiki' to a page
* 'Wiki', but not to a page 'Media').
*
* Set both $wordStartOnly and $wordEndOnly to true to enforce matching
* whole titles.
*
* @var bool $wordEndOnly
*/
public $wordEndOnly;
/**
* Whether or not to skip templates. If set to true, text inside transclusions
* will not be linked.
* @var bool $skipTemplates
*/
public $skipTemplates;
/**
* Whether or not to parse headings.
* @var bool $parseHeadings
*/
public $parseHeadings;
/**
* Whether to check if a potential target page redirects to the source page.
* Set this to true to avoid indirect linkbacks via redirects.
*
* @var bool $checkRedirect
*/
public $checkRedirect;
/**
* Whether to enable the __NOAUTOLINKTARGET__ magic word which prevents
* a potential target page from being linked to.
*
* @var bool $enableNoTargetMagicWord
*/
public $enableNoTargetMagicWord;
/**
* Time (in seconds) after which to reload the special page.
* @var int $specialPageReloadAfter
*/
public $specialPageReloadAfter;
/**
* Whether to link to pages in the same namespace (default is true).
* @var bool $sameNamespace
*/
public $sameNamespace;
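/**
* Whether to print log messages to the console. Not backed by a global
* configuration variable; the constructor sets this to false.
* @var bool $enableConsoleOutput
*/
public $enableConsoleOutput;
/**
* Whether to print debug log messages to the console. Not backed by a
* global configuration variable; the constructor sets this to false.
* @var bool $enableDebugConsoleOutput
*/
public $enableDebugConsoleOutput;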
/**
* Constructs a new Config object.
*
* The object's member variables will automatically be set with the values
* from the corresponding global variables.
*/
public function __construct() {
global $wgLinkTitlesParseOnEdit;
global $wgLinkTitlesParseOnRender;
global $wgLinkTitlesPreferShortTitles;
global $wgLinkTitlesMinimumTitleLength;
global $wgLinkTitlesBlackList;
global $wgLinkTitlesSourceNamespaces;
global $wgLinkTitlesTargetNamespaces;
global $wgLinkTitlesSameNamespace;
global $wgLinkTitlesFirstOnly;
global $wgLinkTitlesSmartMode;
global $wgCapitalLinks;
global $wgLinkTitlesWordStartOnly;
global $wgLinkTitlesWordEndOnly;
global $wgLinkTitlesSkipTemplates;
global $wgLinkTitlesParseHeadings;
global $wgLinkTitlesEnableNoTargetMagicWord;
global $wgLinkTitlesCheckRedirect;
global $wgLinkTitlesSpecialPageReloadAfter;
$this->parseOnEdit = $wgLinkTitlesParseOnEdit;
$this->parseOnRender = $wgLinkTitlesParseOnRender;
$this->preferShortTitles = $wgLinkTitlesPreferShortTitles;
$this->minimumTitleLength = $wgLinkTitlesMinimumTitleLength;
$this->blackList = $wgLinkTitlesBlackList;
$this->sourceNamespaces = $wgLinkTitlesSourceNamespaces ? $wgLinkTitlesSourceNamespaces : [ NS_MAIN ];
$this->targetNamespaces = $wgLinkTitlesTargetNamespaces;
$this->sameNamespace = $wgLinkTitlesSameNamespace;
$this->firstOnly = $wgLinkTitlesFirstOnly;
$this->smartMode = $wgLinkTitlesSmartMode;
$this->capitalLinks = $wgCapitalLinks; // MediaWiki global variable
$this->wordStartOnly = $wgLinkTitlesWordStartOnly;
$this->wordEndOnly = $wgLinkTitlesWordEndOnly;
$this->skipTemplates = $wgLinkTitlesSkipTemplates;
$this->parseHeadings = $wgLinkTitlesParseHeadings;
$this->enableNoTargetMagicWord = $wgLinkTitlesEnableNoTargetMagicWord;
$this->checkRedirect = $wgLinkTitlesCheckRedirect;
$this->specialPageReloadAfter = $wgLinkTitlesSpecialPageReloadAfter;
$this->enableConsoleOutput = false;
$this->enableDebugConsoleOutput = false;
}
}
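
The interaction of `$wordStartOnly` and `$wordEndOnly` documented above can be illustrated in isolation. The following is a minimal sketch, not the extension's actual code: `buildTitleRegex()` is a hypothetical helper, the look-around fragments `(?<!\pL)` and `(?!\pL)` are the ones used by the pre-5.0.0 `BuildDelimiters()` method, and the `/u` modifier is an assumption added here so that `\pL` is applied to UTF-8 input.

```php
<?php
/**
 * Hypothetical helper (not part of LinkTitles): builds a regex for a page
 * title that is optionally anchored to the start and/or end of a word.
 */
function buildTitleRegex( string $title, bool $wordStartOnly, bool $wordEndOnly ): string {
	// Negative look-behind: the match must not be preceded by a letter.
	$start = $wordStartOnly ? '(?<!\pL)' : '';
	// Negative look-ahead: the match must not be followed by a letter.
	$end = $wordEndOnly ? '(?!\pL)' : '';
	return '/' . $start . '(' . preg_quote( $title, '/' ) . ')' . $end . '/u';
}

$text = 'MediaWiki';

// Word start only: 'Media' is found in 'MediaWiki', 'Wiki' is not.
var_dump( preg_match( buildTitleRegex( 'Media', true, false ), $text ) === 1 ); // bool(true)
var_dump( preg_match( buildTitleRegex( 'Wiki', true, false ), $text ) === 1 );  // bool(false)

// Word end only: 'Wiki' is found in 'MediaWiki', 'Media' is not.
var_dump( preg_match( buildTitleRegex( 'Wiki', false, true ), $text ) === 1 );  // bool(true)
var_dump( preg_match( buildTitleRegex( 'Media', false, true ), $text ) === 1 ); // bool(false)

// Both options: only the whole word matches.
var_dump( preg_match( buildTitleRegex( 'MediaWiki', true, true ), $text ) === 1 ); // bool(true)
```

With both options enabled, neither 'Media' nor 'Wiki' would match inside 'MediaWiki', which is what the docblocks mean by enforcing whole titles.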

149
includes/Extension.php Normal file

@ -0,0 +1,149 @@
<?php
/**
* The LinkTitles\Extension class provides event handlers and entry points for the extension.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Provides event handlers and entry points for the extension.
*/
class Extension {
/**
* Event handler for the PageContentSave hook.
*
* This handler is used if the parseOnEdit configuration option is set.
*/
public static function onPageContentSave( &$wikiPage, &$user, &$content, &$summary,
$isMinor, $isWatch, $section, &$flags, &$status ) {
$config = new Config();
if ( !$config->parseOnEdit || $isMinor ) return true;
$source = Source::createFromPageandContent( $wikiPage, $content, $config );
$linker = new Linker( $config );
$result = $linker->linkContent( $source );
if ( $result ) {
$content = $source->setText( $result );
}
return true;
}
/**
* Event handler for the InternalParseBeforeLinks hook.
*
* This handler is used if the parseOnRender configuration option is set.
*/
public static function onInternalParseBeforeLinks( \Parser &$parser, &$text ) {
$config = new Config();
if ( !$config->parseOnRender ) return true;
$source = Source::createFromParserAndText( $parser, $text, $config );
$linker = new Linker( $config );
$result = $linker->linkContent( $source );
if ( $result ) {
$text = $result;
}
return true;
}
/**
* Adds links to a single page.
*
* Entry point for the SpecialLinkTitles class and the LinkTitlesJob class.
*
* @param \Title $title Title object.
* @param \RequestContext $context Current request context. If in doubt, call MediaWiki's `RequestContext::getMain()` to obtain such an object.
* @return bool True if the page exists, false if the page does not exist
*/
public static function processPage( \Title $title, \RequestContext $context ) {
$config = new Config();
$source = Source::createFromTitle( $title, $config );
if ( $source->hasContent() ) {
$linker = new Linker( $config );
$result = $linker->linkContent( $source );
if ( $result ) {
$content = $source->getContent()->getContentHandler()->unserializeContent( $result );
$source->getPage()->doEditContent(
$content,
"Links to existing pages added by LinkTitles bot.", // TODO: i18n
EDIT_MINOR | EDIT_FORCE_BOT,
false, // baseRevId
$context->getUser()
);
}
return true;
}
else {
return false;
}
}
/**
* Adds the two magic words defined by this extension to the list of
* 'double-underscore' terms that are automatically removed before a
* page is displayed.
*
* @param Array $doubleUnderscoreIDs Array of magic word IDs.
* @return true
*/
public static function onGetDoubleUnderscoreIDs( array &$doubleUnderscoreIDs ) {
$doubleUnderscoreIDs[] = 'MAG_LINKTITLES_NOTARGET';
$doubleUnderscoreIDs[] = 'MAG_LINKTITLES_NOAUTOLINKS';
return true;
}
/**
* Handles the ParserFirstCallInit hook and adds the <autolinks> and
* <noautolinks> tags.
*/
public static function onParserFirstCallInit( \Parser $parser ) {
$parser->setHook( 'noautolinks', 'LinkTitles\Extension::doNoautolinksTag' );
$parser->setHook( 'autolinks', 'LinkTitles\Extension::doAutolinksTag' );
}
/**
* Handles the <noautolinks> tag that this extension provides by parsing
* the text between the tags (if any) as wiki text and returning the result.
* See https://www.mediawiki.org/wiki/Manual:Tag_extensions#Example
*/
public static function doNoautolinksTag( $input, array $args, \Parser $parser, \PPFrame $frame ) {
return $parser->recursiveTagParse( $input, $frame );
}
/**
* Handles the <autolinks> tag that this extension provides: adds links to
* existing pages (if any can be added) and parses the result as wiki text.
* See https://www.mediawiki.org/wiki/Manual:Tag_extensions#How_do_I_render_wikitext_in_my_extension.3F
*/
public static function doAutolinksTag( $input, array $args, \Parser $parser, \PPFrame $frame ) {
$config = new Config();
$linker = new Linker( $config );
$source = Source::createFromParser( $parser, $config );
$result = $linker->linkContent( $source );
if ( $result ) {
return $parser->recursiveTagParse( $result, $frame );
} else {
return $parser->recursiveTagParse( $input, $frame );
}
}
}
// vim: ts=2:sw=2:noet:comments^=\:///


@ -1,527 +0,0 @@
<?php
/*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
/// @file
namespace LinkTitles;
/// Helper function for development and debugging.
/// @param $var Any variable. Raw content will be dumped to stderr.
/// @return undefined
function dump($var) {
error_log(print_r($var, TRUE) . "\n", 3, 'php://stderr');
};
/// Central class of the extension. Sets up parser hooks.
/// This class contains only static functions; do not instantiate.
class Extension {
/// Caching variable for page titles that are fetched from the DB.
private static $pageTitles;
/// Caching variable for the current namespace.
/// This is needed because the sort order of the page titles that
/// are cached in self::$pageTitles depends on the namespace of
/// the page currently being processed.
private static $currentNamespace;
/// A Title object for the page that is being parsed.
private static $currentTitle;
/// A Title object for the target page currently being examined.
private static $targetTitle;
// The TitleValue object of the target page
private static $targetTitleValue;
/// The content object for the currently processed target page.
/// This variable is necessary to be able to prevent loading the target
/// content twice.
private static $targetContent;
/// Holds the page title of the currently processed target page
/// as a string.
private static $targetTitleText;
/// Delimiter used in a regexp split operation to separate those parts
/// of the page that should be parsed from those that should not be
/// parsed (e.g. inside pre-existing links etc.).
private static $delimiter;
private static $wordStartDelim;
private static $wordEndDelim;
public static $ltConsoleOutput;
public static $ltConsoleOutputDebug;
/// Setup method
public static function setup() {
self::BuildDelimiters();
}
/// Event handler that is hooked to the PageContentSave event.
public static function onPageContentSave( &$wikiPage, &$user, &$content, &$summary,
$isMinor, $isWatch, $section, &$flags, &$status ) {
global $wgLinkTitlesParseOnEdit;
global $wgLinkTitlesNamespaces;
if ( !$wgLinkTitlesParseOnEdit ) return true;
if ( !$isMinor ) {
$title = $wikiPage->getTitle();
// Only process if page is in one of our namespaces we want to link
// Fixes ugly autolinking of sidebar pages
if ( in_array( $title->getNamespace(), $wgLinkTitlesNamespaces )) {
$text = $content->getContentHandler()->serializeContent( $content );
if ( !\MagicWord::get( 'MAG_LINKTITLES_NOAUTOLINKS' )->match( $text ) ) {
$newText = self::parseContent( $title, $text );
if ( $newText != $text ) {
$content = $content->getContentHandler()->unserializeContent( $newText );
}
}
}
};
return true;
}
/// Event handler that is hooked to the InternalParseBeforeLinks event.
/// @param Parser $parser Parser that raised the event.
/// @param $text Preprocessed text of the page.
public static function onInternalParseBeforeLinks( \Parser &$parser, &$text ) {
global $wgLinkTitlesParseOnRender;
if (!$wgLinkTitlesParseOnRender) return true;
global $wgLinkTitlesNamespaces;
$title = $parser->getTitle();
// If the page contains the magic word '__NOAUTOLINKS__', do not parse it.
// Only process if page is in one of our namespaces we want to link
if ( !\MagicWord::get( 'MAG_LINKTITLES_NOAUTOLINKS' )->match( $text ) &&
in_array( $title->getNamespace(), $wgLinkTitlesNamespaces ) ) {
$text = self::parseContent( $title, $text );
}
return true;
}
/// Core function of the extension, performs the actual parsing of the content.
/// @param Parser $parser Parser instance for the current page
/// @param $text String that holds the article content
/// @returns string: parsed text with links added if needed
private static function parseContent( $title, &$text ) {
// Configuration variables need to be defined here as globals.
global $wgLinkTitlesFirstOnly;
global $wgLinkTitlesSmartMode;
global $wgCapitalLinks;
( $wgLinkTitlesFirstOnly ) ? $limit = 1 : $limit = -1;
$limitReached = false;
self::$currentTitle = $title;
$currentNamespace = $title->getNamespace();
$newText = $text;
if ( !isset( self::$pageTitles ) || ( $currentNamespace != self::$currentNamespace ) ) {
self::$currentNamespace = $currentNamespace;
self::$pageTitles = self::fetchPageTitles( $currentNamespace );
}
// Iterate through the page titles
foreach( self::$pageTitles as $row ) {
self::newTarget( $row->page_namespace, $row->page_title );
// Don't link current page
if ( self::$targetTitle->equals( self::$currentTitle ) ) { continue; }
// split the page content by [[...]] groups
// credits to inhan @ StackOverflow for suggesting preg_split
// see http://stackoverflow.com/questions/10672286
$arr = preg_split( self::$delimiter, $newText, -1, PREG_SPLIT_DELIM_CAPTURE );
// Escape certain special characters in the page title to prevent
// regexp compilation errors
self::$targetTitleText = self::$targetTitle->getText();
$quotedTitle = preg_quote( self::$targetTitleText, '/' );
self::ltDebugLog( 'TargetTitle='. self::$targetTitleText, 'private' );
self::ltDebugLog( 'TargetTitleQuoted='. $quotedTitle, 'private' );
// Depending on the global configuration setting $wgCapitalLinks,
// the title has to be searched for either in a strictly case-sensitive
// way, or in a 'fuzzy' way where the first letter of the title may
// be either case.
if ( $wgCapitalLinks && ( $quotedTitle[0] != '\\' )) {
$searchTerm = '((?i)' . $quotedTitle[0] . '(?-i)' .
substr($quotedTitle, 1) . ')';
} else {
$searchTerm = '(' . $quotedTitle . ')';
}
$regex = '/(?<![\:\.\@\/\?\&])' . self::$wordStartDelim .
$searchTerm . self::$wordEndDelim . '/S';
for ( $i = 0; $i < count( $arr ); $i+=2 ) {
// even indexes will point to text that is not enclosed by brackets
$arr[$i] = preg_replace_callback( $regex,
'LinkTitles\Extension::simpleModeCallback', $arr[$i], $limit, $count );
if ( $wgLinkTitlesFirstOnly && ( $count > 0 ) ) {
$limitReached = true;
break;
};
};
$newText = implode( '', $arr );
// If smart mode is turned on, the extension will perform a second
// pass on the page and add links with aliases where the case does
// not match.
if ( $wgLinkTitlesSmartMode && !$limitReached ) {
$arr = preg_split( self::$delimiter, $newText, -1, PREG_SPLIT_DELIM_CAPTURE );
for ( $i = 0; $i < count( $arr ); $i+=2 ) {
// even indexes will point to text that is not enclosed by brackets
$arr[$i] = preg_replace_callback( '/(?<![\:\.\@\/\?\&])' .
self::$wordStartDelim . '(' . $quotedTitle . ')' .
self::$wordEndDelim . '/iS', 'LinkTitles\Extension::smartModeCallback',
$arr[$i], $limit, $count );
if ( $wgLinkTitlesFirstOnly && ( $count > 0 )) {
break;
};
};
$newText = implode( '', $arr );
} // $wgLinkTitlesSmartMode
}; // foreach $res as $row
return $newText;
}
/// Automatically processes a single page, given a $title Title object.
/// This function is called by the SpecialLinkTitles class and the
/// LinkTitlesJob class.
/// @param Title $title Title object.
/// @param RequestContext $context Current request context.
/// If in doubt, call MediaWiki's `RequestContext::getMain()`
/// to obtain such an object.
/// @returns boolean True if the page exists, false if the page does not exist
public static function processPage( \Title $title, \RequestContext $context ) {
self::ltLog('Processing '. $title->getPrefixedText());
$page = \WikiPage::factory($title);
$content = $page->getContent();
if ( $content != null ) {
$text = $content->getContentHandler()->serializeContent($content);
$newText = self::parseContent($title, $text);
if ( $text != $newText ) {
$content = $content->getContentHandler()->unserializeContent( $newText );
$page->doEditContent(
$content,
"Links to existing pages added by LinkTitles bot.", // TODO: i18n
EDIT_MINOR | EDIT_FORCE_BOT,
false, // baseRevId
$context->getUser()
);
};
return true;
}
else {
return false;
}
}
/// Adds the two magic words defined by this extension to the list of
/// 'double-underscore' terms that are automatically removed before a
/// page is displayed.
/// @param $doubleUnderscoreIDs Array of magic word IDs.
/// @return true
public static function onGetDoubleUnderscoreIDs( array &$doubleUnderscoreIDs ) {
$doubleUnderscoreIDs[] = 'MAG_LINKTITLES_NOTARGET';
$doubleUnderscoreIDs[] = 'MAG_LINKTITLES_NOAUTOLINKS';
return true;
}
public static function onParserFirstCallInit( \Parser $parser ) {
$parser->setHook( 'noautolinks', 'LinkTitles\Extension::doNoautolinksTag' );
$parser->setHook( 'autolinks', 'LinkTitles\Extension::doAutolinksTag' );
}
/// Removes the extra tag that this extension provides (<noautolinks>)
/// by simply returning the text between the tags (if any).
/// See https://www.mediawiki.org/wiki/Manual:Tag_extensions#Example
public static function doNoautolinksTag( $input, array $args, \Parser $parser, \PPFrame $frame ) {
return htmlspecialchars( $input );
}
/// Removes the extra tag that this extension provides (<noautolinks>)
/// by simply returning the text between the tags (if any).
/// See https://www.mediawiki.org/wiki/Manual:Tag_extensions#How_do_I_render_wikitext_in_my_extension.3F
public static function doAutolinksTag( $input, array $args, \Parser $parser, \PPFrame $frame ) {
$withLinks = self::parseContent( $parser->getTitle(), $input );
$output = $parser->recursiveTagParse( $withLinks, $frame );
return $output;
}
// Fetches the page titles from the database.
// @param $currentNamespace String holding the namespace of the page currently being processed.
private static function fetchPageTitles( $currentNamespace ) {
global $wgLinkTitlesPreferShortTitles;
global $wgLinkTitlesMinimumTitleLength;
global $wgLinkTitlesBlackList;
global $wgLinkTitlesNamespaces;
( $wgLinkTitlesPreferShortTitles ) ? $sort_order = 'ASC' : $sort_order = 'DESC';
// Build a blacklist of pages that are not supposed to be link
// targets. This includes the current page.
$blackList = str_replace( ' ', '_', '("' . implode( '","',$wgLinkTitlesBlackList ) . '")' );
// Build our weight list. Make sure current namespace is first element
$namespaces = array_diff( $wgLinkTitlesNamespaces, [ $currentNamespace ] );
array_unshift( $namespaces, $currentNamespace );
// No need for a sanity check; we are sure that we have at least one element in the array.
$weightSelect = "CASE page_namespace ";
$currentWeight = 0;
foreach ($namespaces as &$namspacevalue) {
$currentWeight = $currentWeight + 100;
$weightSelect = $weightSelect . " WHEN " . $namspacevalue . " THEN " . $currentWeight . PHP_EOL;
}
$weightSelect = $weightSelect . " END ";
$namespacesClause = '(' . implode( ', ', $namespaces ) . ')';
// Build an SQL query and fetch all page titles ordered by length from
// shortest to longest. Only titles from 'normal' pages (namespace uid
// = 0) are returned. Since the db may be sqlite, we need a try..catch
// structure because sqlite does not support the CHAR_LENGTH function.
$dbr = wfGetDB( DB_SLAVE );
try {
$res = $dbr->select(
'page',
array( 'page_title', 'page_namespace' , "weight" => $weightSelect),
array(
'page_namespace IN ' . $namespacesClause,
'CHAR_LENGTH(page_title) >= ' . $wgLinkTitlesMinimumTitleLength,
'page_title NOT IN ' . $blackList,
),
__METHOD__,
array( 'ORDER BY' => 'weight ASC, CHAR_LENGTH(page_title) ' . $sort_order )
);
} catch (\Exception $e) {
$res = $dbr->select(
'page',
array( 'page_title', 'page_namespace' , "weight" => $weightSelect ),
array(
'page_namespace IN ' . $namespacesClause,
'LENGTH(page_title) >= ' . $wgLinkTitlesMinimumTitleLength,
'page_title NOT IN ' . $blackList,
),
__METHOD__,
array( 'ORDER BY' => 'weight ASC, LENGTH(page_title) ' . $sort_order )
);
}
return $res;
}
// Build an anonymous callback function to be used in simple mode.
private static function simpleModeCallback( array $matches ) {
if ( self::checkTargetPage() ) {
self::ltLog( "Linking '$matches[0]' to '" . self::$targetTitle . "'" );
return '[[' . $matches[0] . ']]';
}
else
{
return $matches[0];
}
}
// Callback function for use with preg_replace_callback.
// This essentially performs a case-sensitive comparison of the
// current page title and the occurrence found on the page; if
// the cases do not match, it builds an aliased (piped) link.
// If $wgCapitalLinks is set to true, the case of the first
// letter is ignored by MediaWiki and we don't need to build a
// piped link if only the case of the first letter is different.
private static function smartModeCallback( array $matches ) {
global $wgCapitalLinks;
if ( $wgCapitalLinks ) {
// With $wgCapitalLinks set to true we have a slightly more
// complicated version of the callback than if it were false;
// we need to ignore the first letter of the page titles, as
// it does not matter for linking.
if ( self::checkTargetPage() ) {
self::ltLog( "Linking (smart) '$matches[0]' to '" . self::$targetTitle . "'" );
if ( strcmp(substr(self::$targetTitleText, 1), substr($matches[0], 1)) == 0 ) {
// Case-sensitive match: no need to build piped link.
return '[[' . $matches[0] . ']]';
} else {
// Case-insensitive match: build piped link.
return '[[' . self::$targetTitleText . '|' . $matches[0] . ']]';
}
}
else
{
return $matches[0];
}
} else {
// If $wgCapitalLinks is false, we can use the simple variant
// of the callback function.
if ( self::checkTargetPage() ) {
self::ltLog( "Linking (smart) '$matches[0]' to '" . self::$targetTitle . "'" );
if ( strcmp(self::$targetTitleText, $matches[0]) == 0 ) {
// Case-sensitive match: no need to build piped link.
return '[[' . $matches[0] . ']]';
} else {
// Case-insensitive match: build piped link.
return '[[' . self::$targetTitleText . '|' . $matches[0] . ']]';
}
}
else
{
return $matches[0];
}
}
}
/// Sets member variables for the current target page.
private static function newTarget( $ns, $title ) {
self::$targetTitle = \Title::makeTitleSafe( $ns, $title );
self::ltDebugLog( 'newtarget='. self::$targetTitle->getText(), "private" );
self::$targetTitleValue = self::$targetTitle->getTitleValue();
self::ltDebugLog( 'altTarget='. self::$targetTitleValue->getText(), "private" );
self::$targetContent = null;
}
/// Returns the content of the current target page.
/// This function serves to be used in preg_replace_callback callback
/// functions, in order to load the target page content from the
/// database only when needed.
/// @note It is absolutely necessary that the newTarget()
/// function is called for every new page.
private static function getTargetContent() {
if ( ! isset( self::$targetContent ) ) {
self::$targetContent = \WikiPage::factory(
self::$targetTitle)->getContent();
};
return self::$targetContent;
}
/// Examines the current target page. Returns true if it may be linked;
/// false if not. This depends on the settings
/// $wgLinkTitlesCheckRedirect and $wgLinkTitlesEnableNoTargetMagicWord
/// and whether the target page is a redirect or contains the
/// __NOAUTOLINKTARGET__ magic word.
/// @returns boolean
private static function checkTargetPage() {
global $wgLinkTitlesEnableNoTargetMagicWord;
global $wgLinkTitlesCheckRedirect;
// If checking for redirects is enabled and the target page does
// indeed redirect to the current page, return the page title as-is
// (unlinked).
if ( $wgLinkTitlesCheckRedirect ) {
$redirectTitle = self::getTargetContent()->getUltimateRedirectTarget();
if ( $redirectTitle && $redirectTitle->equals(self::$currentTitle) ) {
return false;
}
};
// If the magic word __NOAUTOLINKTARGET__ is enabled and the target
// page does indeed contain this magic word, return the page title
// as-is (unlinked).
if ( $wgLinkTitlesEnableNoTargetMagicWord ) {
if ( self::getTargetContent()->matchMagicWord(
\MagicWord::get('MAG_LINKTITLES_NOTARGET') ) ) {
return false;
}
};
return true;
}
/// Builds the delimiter that is used in a regexp to separate
/// text that should be parsed from text that should not be
/// parsed (e.g. inside existing links etc.)
private static function BuildDelimiters() {
// Configuration variables need to be defined here as globals.
global $wgLinkTitlesParseHeadings;
global $wgLinkTitlesSkipTemplates;
global $wgLinkTitlesWordStartOnly;
global $wgLinkTitlesWordEndOnly;
// Use unicode character properties rather than \b escape sequences
// to detect whole words containing non-ASCII characters as well.
// Note that this requires a PCRE library that was compiled with
// --enable-unicode-properties
( $wgLinkTitlesWordStartOnly ) ? self::$wordStartDelim = '(?<!\pL)' : self::$wordStartDelim = '';
( $wgLinkTitlesWordEndOnly ) ? self::$wordEndDelim = '(?!\pL)' : self::$wordEndDelim = '';
if ( $wgLinkTitlesSkipTemplates )
{
// Use recursive regex to balance curly braces;
// see http://www.regular-expressions.info/recurse.html
$templatesDelimiter = '{{(?>[^{}]|(?R))*}}|';
} else {
// Match template names (ignoring any piped [[]] links in them)
// along with the trailing pipe and parameter name or closing
// braces; also match sequences of '|wordcharacters=' (without
// spaces in them) that usually only occur as parameter names in
// transclusions (but could also occur as wiki table cell contents).
// TODO: Find a way to match parameter names in transclusions, but
// not in table cells or other sequences involving a pipe character
// and equal sign.
$templatesDelimiter = '{{[^|]*?(?:(?:\[\[[^]]+]])?)[^|]*?(?:\|(?:\w+=)?|(?:}}))|\|\w+=|';
}
// Build a regular expression that will capture existing wiki links ("[[...]]"),
// wiki headings ("= ... =", "== ... ==" etc.),
// urls ("http://example.com", "[http://example.com]", "[http://example.com Description]",
// and email addresses ("mail@example.com").
// Since there is a user option to skip headings, we make this part of the expression
// optional. Note that in order to use preg_split(), it is important to have only one
// capturing subpattern (which precludes the use of conditional subpatterns).
( $wgLinkTitlesParseHeadings ) ? $delimiter = '' : $delimiter = '=+.+?=+|';
$urlPattern = '[a-z]+?\:\/\/(?:\S+\.)+\S+(?:\/.*)?';
self::$delimiter = '/(' . // exclude from linking:
'\[\[.*?\]\]|' . // links
$delimiter . // titles (if requested)
$templatesDelimiter . // templates (if requested)
'^ .+?\n|\n .+?\n|\n .+?$|^ .+?$|' . // preformatted text
'<nowiki>.*?<.nowiki>|<code>.*?<\/code>|' . // nowiki/code
'<pre>.*?<\/pre>|<html>.*?<\/html>|' . // pre/html
'<script>.*?<\/script>|' . // script
'<gallery>.*?<\/gallery>|' . // gallery
'<div.+?>|<\/div>|' . // attributes of div elements
'<span.+?>|<\/span>|' . // attributes of span elements
'<file>[^<]*<\/file>|' . // stuff inside file elements
'style=".+?"|class=".+?"|' . // styles and classes (e.g. of wikitables)
'<noautolinks>.*?<\/noautolinks>|' . // custom tag 'noautolinks'
'\[' . $urlPattern . '\s.+?\]|'. $urlPattern . '(?=\s|$)|' . // urls
'(?<=\b)\S+\@(?:\S+\.)+\S+(?=\b)' . // email addresses
')/ismS';
}
/// Local Debugging output function which can send output to console as well
public static function ltDebugLog($text) {
if ( self::$ltConsoleOutputDebug ) {
print $text . "\n";
}
wfDebugLog( 'LinkTitles', $text , 'private' );
}
/// Local Logging output function which can send output to console as well
public static function ltLog($text) {
if (self::$ltConsoleOutput) {
print $text . "\n";
}
wfDebugLog( 'LinkTitles', $text , 'private' );
}
}
// vim: ts=2:sw=2:noet:comments^=\:///
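
The split-and-replace technique used by `parseContent()` above (and retained in `Linker::linkContent()`) can be sketched in isolation. This is a simplified illustration under stated assumptions: `linkOutsideExistingLinks()` is a hypothetical function, the delimiter only recognizes existing `[[...]]` links rather than the full pattern built by `BuildDelimiters()`, whole-word matching with the `/u` modifier is assumed, and no target-page checks are performed.

```php
<?php
/**
 * Hypothetical function (not part of LinkTitles): links occurrences of
 * $title in $text, but leaves existing [[...]] links untouched.
 */
function linkOutsideExistingLinks( string $text, string $title ): string {
	// Exactly one capturing group, so that preg_split() returns the
	// delimiters (existing links) at the odd indexes of the result.
	$delimiter = '/(\[\[.*?\]\])/s';
	$parts = preg_split( $delimiter, $text, -1, PREG_SPLIT_DELIM_CAPTURE );

	// Escape the title to prevent regexp compilation errors.
	$regex = '/(?<!\pL)(' . preg_quote( $title, '/' ) . ')(?!\pL)/u';

	// Even indexes point to text that is not enclosed by brackets.
	for ( $i = 0; $i < count( $parts ); $i += 2 ) {
		$parts[$i] = preg_replace( $regex, '[[$1]]', $parts[$i] );
	}
	return implode( '', $parts );
}

echo linkOutsideExistingLinks( 'Foo links to [[Foo]] and Foo again.', 'Foo' ), "\n";
// prints "[[Foo]] links to [[Foo]] and [[Foo]] again."
```

Because the existing link sits at an odd index, it is never passed to `preg_replace()`, so it cannot be wrapped in a second pair of brackets.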

202
includes/Linker.php Normal file

@ -0,0 +1,202 @@
<?php
/**
* The LinkTitles\Linker class does the heavy linking for the extension.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Performs the actual linking of content to existing pages.
*/
class Linker {
/**
* LinkTitles configuration.
*
* @var Config $config
*/
public $config;
/**
* The link value of the target page that is currently being evaluated.
* This may be either the page name or the page name prefixed with the
* name space if the target's name space is not NS_MAIN.
*
* This is an instance variable (rather than a local method variable) so it
* can be accessed in the preg_replace_callback callbacks.
*
* @var String $linkValue
*/
private $linkValue;
/**
* Constructs a new instance of the Linker class.
*
* @param Config $config LinkTitles configuration object.
*/
public function __construct( Config &$config ) {
$this->config = $config;
}
/**
* Core function of the extension, performs the actual parsing of the content.
*
* This method receives a Source object that encapsulates the title and the
* text of the source page. It does not work on a WikiPage object directly
* because the callbacks in the Extension class do not always get a WikiPage
* object in the first place.
*
* @param Source $source Source object for the current page.
* @return String|null Source page text with links to target pages, or null if no links were added
*/
public function linkContent( Source $source ) {
if ( !$source->canBeLinked() ) {
return;
}
$limit = $this->config->firstOnly ? 1 : -1;
$limitReached = false;
$newLinks = false; // whether or not new links were added
$newText = $source->getText();
$splitter = Splitter::singleton( $this->config );
$targets = Targets::singleton( $source->getTitle(), $this->config );
// Iterate through the target page titles
foreach( $targets->queryResult as $row ) {
$target = new Target( $row->page_namespace, $row->page_title, $this->config );
// Don't link current page and don't link if the target page redirects
// to the current page or has the __NOAUTOLINKTARGET__ magic word
// (as required by the actual LinkTitles configuration).
if ( $target->isSameTitle( $source ) || !$target->mayLinkTo( $source ) ) {
continue;
}
// Dealing with existing links if the firstOnly option is set:
// A link to the current page should only be recognized if it appears in
// clear text, i.e. we do not count piped links as existing links.
// (Similarly, by design, redirections should not be counted as existing links.)
if ( $limit == 1 && preg_match( '/\[\[' . $target->getCaseSensitiveLinkValueRegex() . '\]\]/' , $source->getText() ) ) {
continue;
}
// Split the page content by non-linkable sections.
// Credits to inhan @ StackOverflow for suggesting preg_split.
// See http://stackoverflow.com/questions/10672286
$arr = $splitter->split( $newText );
$count = 0;
// Cache the target title text for the regex callbacks
$this->linkValue = $target->getPrefixedTitleText();
// Even indexes will point to sections of the text that may be linked
for ( $i = 0; $i < count( $arr ); $i += 2 ) {
$arr[$i] = preg_replace_callback( $target->getCaseSensitiveRegex(),
array( $this, 'simpleModeCallback'),
$arr[$i], $limit, $count );
if ( $this->config->firstOnly && ( $count > 0 ) ) {
$limitReached = true;
break;
};
};
if ( $count > 0 ) {
$newLinks = true;
$newText = implode( '', $arr );
}
// If smart mode is turned on, the extension will perform a second
// pass on the page and add links with aliases where the case does
// not match.
if ( $this->config->smartMode && !$limitReached ) {
if ( $count > 0 ) {
// Split the text again because it was changed in the first pass.
$arr = $splitter->split( $newText );
}
for ( $i = 0; $i < count( $arr ); $i+=2 ) {
// even indexes will point to text that is not enclosed by brackets
$arr[$i] = preg_replace_callback( $target->getCaseInsensitiveRegex(),
array( $this, 'smartModeCallback'),
$arr[$i], $limit, $count );
if ( $this->config->firstOnly && ( $count > 0 )) {
break;
};
};
if ( $count > 0 ) {
$newLinks = true;
$newText = implode( '', $arr );
}
} // smartMode
}; // foreach $targets->queryResult as $row
if ( $newLinks ) {
return $newText;
}
}
/**
* Callback for preg_replace_callback in simple mode.
*
* @param array $matches Matches provided by preg_replace_callback
* @return string Target page title with or without link markup
*/
private function simpleModeCallback( array $matches ) {
// If the link value is longer than the match, it must be prefixed with
// a namespace. In this case, we build a piped link.
if ( strlen( $this->linkValue ) > strlen( $matches[0] ) ) {
return '[[' . $this->linkValue . '|' . $matches[0] . ']]';
} else {
return '[[' . $matches[0] . ']]';
}
}
/**
* Callback function for use with preg_replace_callback.
* This essentially performs a case-sensitive comparison of the
* current page title and the occurrence found on the page; if
* the cases do not match, it builds an aliased (piped) link.
* If $wgCapitalLinks is set to true, the case of the first
* letter is ignored by MediaWiki and we don't need to build a
* piped link if only the case of the first letter is different.
*
* @param array $matches Matches provided by preg_replace_callback
* @return string Target page title with or without link markup
*/
private function smartModeCallback( array $matches ) {
// If cases of the target page title and the actual occurrence in the text
// are not identical, we need to build a piped link.
// How case-identity is determined depends on the $wgCapitalLinks setting:
// with $wgCapitalLinks = true, the case of first letter of the title is
// not significant.
if ( $this->config->capitalLinks ) {
$needPipe = strcmp( substr( $this->linkValue, 1 ), substr( $matches[ 0 ], 1 ) ) != 0;
} else {
$needPipe = strcmp( $this->linkValue, $matches[ 0 ] ) != 0;
}
if ( $needPipe ) {
return '[[' . $this->linkValue . '|' . $matches[ 0 ] . ']]';
} else {
return '[[' . $matches[ 0 ] . ']]';
}
}
}
// vim: ts=2:sw=2:noet:comments^=\:///
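
The loops in `Linker::linkContent()` rely on `preg_split()` with `PREG_SPLIT_DELIM_CAPTURE`: when the splitting pattern has exactly one capturing group, the even indexes of the result hold linkable text and the odd indexes hold the protected delimiters. The following is a minimal standalone sketch of that technique, not the extension's code. It assumes a much simplified delimiter that protects only existing `[[...]]` links, and the function name `linkEvenSections` is hypothetical.

```php
<?php
/**
 * Hypothetical helper (not part of LinkTitles): wraps occurrences of $title
 * in $text in link markup, skipping text that is already inside [[...]].
 */
function linkEvenSections( $text, $title, $firstOnly = false ) {
	// Simplified splitter (assumption): only existing links are protected.
	$sections = preg_split( '/(\[\[.*?\]\])/s', $text, -1, PREG_SPLIT_DELIM_CAPTURE );
	// Whole-word match using Unicode letter properties, as in Target.
	$pattern = '/(?<!\pL)' . preg_quote( $title, '/' ) . '(?!\pL)/u';
	$limit = $firstOnly ? 1 : -1;
	// Even indexes hold linkable text; odd indexes hold captured delimiters.
	for ( $i = 0; $i < count( $sections ); $i += 2 ) {
		$count = 0;
		$sections[$i] = preg_replace_callback(
			$pattern,
			function ( array $matches ) {
				return '[[' . $matches[0] . ']]';
			},
			$sections[$i], $limit, $count
		);
		// With firstOnly, stop as soon as one link was added.
		if ( $firstOnly && $count > 0 ) {
			break;
		}
	}
	return implode( '', $sections );
}

echo linkEvenSections( 'Berlin and [[Berlin]] and Berlin.', 'Berlin' ), "\n";
```

Because the delimiters are captured rather than discarded, `implode( '', $sections )` reassembles the full page text with the protected sections unchanged.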

245
includes/Source.php Normal file
View File

@ -0,0 +1,245 @@
<?php
/**
* The LinkTitles\Source represents a Wiki page to which links may be added.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Represents a page to which links may be added.
*/
class Source {
/**
* The LinkTitles configuration for this Source.
*
* @var Config $config
*/
public $config;
private $title;
private $text;
private $page;
private $content;
/**
* Creates a Source object from a \Title.
* @param \Title $title Title object from which to create the Source.
* @param Config $config LinkTitles configuration.
* @return Source Source object created from the title.
*/
public static function createFromTitle( \Title $title, Config $config ) {
$source = new Source( $config );
$source->title = $title;
return $source;
}
/**
* Creates a Source object with a given Title and a text.
*
* This factory can be called e.g. from an onPageContentSave event handler
* which knows both these parameters.
*
* @param \Title $title Title of the source page
* @param String $text String representation of the page content
* @param Config $config LinkTitles configuration
* @return Source Source object created from the title and the text
*/
public static function createFromTitleAndText( \Title $title, $text, Config $config ) {
$source = Source::createFromTitle( $title, $config);
$source->text = $text;
return $source;
}
/**
* Creates a Source object with a given WikiPage and a Content.
*
* This factory can be called e.g. from an onPageContentSave event handler
* which knows both these parameters.
*
* @param \WikiPage $page WikiPage to link from
* @param \Content $content Page content
* @param Config $config LinkTitles configuration
* @return Source Source object created from the page and the content
*/
public static function createFromPageandContent( \WikiPage $page, \Content $content, Config $config ) {
$source = new Source( $config );
$source->page = $page;
$source->content = $content;
return $source;
}
/**
* Creates a Source object with a given Parser.
*
* @param \Parser $parser Parser object from which to create the Source.
* @param Config $config LinkTitles configuration
* @return Source Source object created from the parser.
*/
public static function createFromParser( \Parser $parser, Config $config ) {
$source = new Source( $config );
$source->title = $parser->getTitle();
return $source;
}
/**
* Creates a Source object with a given Parser and text.
*
* This factory can be called e.g. from an onInternalParseBeforeLinks event
* handler which knows these parameters.
*
* @param \Parser $parser Parser object from which to create the Source.
* @param String $text String representation of the page content.
* @param Config $config LinkTitles configuration
* @return Source Source object created from the parser and the text.
*/
public static function createFromParserAndText( \Parser $parser, $text, Config $config ) {
$source = Source::createFromParser( $parser, $config );
$source->text = $text;
return $source;
}
/**
* Private constructor. Use one of the factories to create a Source object.
* @param Config $config LinkTitles configuration
*/
private function __construct( Config $config) {
$this->config = $config;
}
/**
* Determines whether or not links may be added to this page.
* @return boolean True if the page is in a desired namespace and does not
* contain the __NOAUTOLINKS__ magic word.
*/
public function canBeLinked() {
return $this->hasDesiredNamespace() && !$this->hasNoAutolinksMagicWord();
}
/**
* Determines whether the Source is in a desired namespace, i.e. a namespace
* that is listed in the sourceNamespaces config setting or is NS_MAIN.
* @return boolean True if the Source is in a 'good' namespace.
*/
public function hasDesiredNamespace() {
return in_array( $this->getTitle()->getNamespace(), $this->config->sourceNamespaces );
}
/**
* Determines whether the source page contains the __NOAUTOLINKS__ magic word.
*
* @return boolean True if the page contains the __NOAUTOLINKS__ magic word.
*/
public function hasNoAutolinksMagicWord() {
return \MagicWord::get( 'MAG_LINKTITLES_NOAUTOLINKS' )->match( $this->getText() );
}
/**
* Gets the title.
*
* @return \Title Title of the source page.
*/
public function getTitle() {
if ( $this->title === null ) {
// Access the property directly to avoid an infinite loop.
if ( $this->page != null) {
$this->title = $this->page->getTitle();
} else {
throw new \Exception( 'Unable to create Title for this Source because Page is null.' );
}
}
return $this->title;
}
/**
* Gets the namespace of the source Title.
* @return integer namespace index.
*/
public function getNamespace() {
return $this->getTitle()->getNamespace();
}
/**
* Gets the Content object for the source page.
*
* The value is cached.
*
* @return \Content Content object.
*/
public function getContent() {
if ( $this->content === null ) {
$this->content = $this->getPage()->getContent();
}
return $this->content;
}
/**
* Determines whether the source page has content.
*
* @return boolean True if the source page has content.
*/
public function hasContent() {
return $this->getContent() != null;
}
/**
* Gets the text of the corresponding Wiki page.
*
* The value is cached.
*
* @return String Text of the Wiki page.
*/
public function getText() {
if ( $this->text === null ) {
$content = $this->getContent();
$this->text = $content->getContentHandler()->serializeContent( $content );
}
return $this->text;
}
/**
* Unserializes text to the page's content.
*
* @param String $text Text to unserialize.
* @return \Content The source's updated content object.
*/
public function setText( $text ) {
$this->content = $this->getContent()->getContentHandler()->unserializeContent( $text );
$this->text = $text;
return $this->content;
}
/**
* Returns the source page object.
* @return \WikiPage WikiPage for the source title.
*/
public function getPage() {
if ( $this->page === null ) {
// Access the property directly to avoid an infinite loop.
if ( $this->title != null) {
$this->page = \WikiPage::factory( $this->title );
} else {
throw new \Exception( 'Unable to create Page for this Source because Title is null.' );
}
}
return $this->page;
}
}

View File

@ -1,21 +1,25 @@
<?php
/*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
/**
* Provides a special page for the LinkTitles extension.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/// @defgroup batch Batch processing
@ -25,29 +29,37 @@ if ( !defined( 'MEDIAWIKI' ) ) {
die( 'Not an entry point.' );
}
/// @endcond
/// Provides a special page that can be used to batch-process all pages in
/// the wiki. By default, this can only be performed by sysops.
/// @ingroup batch
class Special extends \SpecialPage {
/// Constructor. Announces the special page title and required user right
/// to the parent constructor.
/**
* Provides a special page that can be used to batch-process all pages in
* the wiki. By default, this can only be performed by sysops.
* @ingroup batch
*
*/
class Special extends \SpecialPage {
private $config;
/**
* Constructor. Announces the special page title and required user right to the parent constructor.
*/
function __construct() {
// the second parameter in the following function call ensures that only
// users who have the 'linktitles-batch' right get to see this page (by
// the second parameter in the following function call ensures that only
// users who have the 'linktitles-batch' right get to see this page (by
// default, these are all sysop users).
parent::__construct( 'LinkTitles', 'linktitles-batch' );
$this->config = new Config();
}
function getGroupName() {
return 'pagetools';
}
/// Entry function of the special page class. Will abort if the user does
/// not have appropriate permissions ('linktitles-batch').
/// @return undefined
function execute($par) {
/**
* Entry function of the special page class. Will abort if the user does not have appropriate permissions ('linktitles-batch').
* @param $par Additional parameters (required by interface; currently not used)
*/
function execute( $par ) {
// Prevent non-authorized users from executing the batch processing.
if ( !$this->userCanExecute( $this->getUser() ) ) {
$this->displayRestrictionError();
@ -76,57 +88,53 @@ class Special extends \SpecialPage {
}
}
/// Processes wiki articles, starting at the page indicated by
/// $startTitle. If $wgLinkTitlesTimeLimit is reached before all pages are
/// processed, returns the title of the next page that needs processing.
/// @param WebRequest $request WebRequest object that is associated with the special
/// page.
/// @param OutputPage $output Output page for the special page.
/**
* Processes wiki articles, starting at the page indicated by
* $startTitle. If $wgLinkTitlesSpecialPageReloadAfter is reached before all pages are
* processed, returns the title of the next page that needs processing.
* @param WebRequest $request WebRequest object that is associated with the special page.
* @param OutputPage $output Output page for the special page.
*/
private function process( \WebRequest &$request, \OutputPage &$output) {
global $wgLinkTitlesTimeLimit;
global $wgLinkTitlesNamespaces;
// get our Namespaces
$namespacesClause = str_replace( '_', ' ','(' . implode( ', ',$wgLinkTitlesNamespaces ) . ')' );
// get our Namespaces
$namespacesClause = str_replace( '_', ' ','(' . implode( ', ',$this->config->sourceNamespaces ) . ')' );
// Start the stopwatch
$startTime = microtime(true);
$startTime = microtime( true );
// Connect to the database
$dbr = wfGetDB( DB_SLAVE );
// Fetch the start index and max number of records from the POST
// Fetch the start index and max number of records from the POST
// request.
$postValues = $request->getValues();
// Convert the start index to an integer; this helps preventing
// SQL injection attacks via forged POST requests.
$start = intval($postValues['s']);
$start = intval( $postValues['s'] );
// If an end index was given, we don't need to query the database
if ( array_key_exists('e', $postValues) ) {
$end = intval($postValues['e']);
if ( array_key_exists( 'e', $postValues ) ) {
$end = intval( $postValues['e'] );
}
else
else
{
// No end index was given. Therefore, count pages now.
$end = $this->countPages($dbr, $namespacesClause );
$end = $this->countPages( $dbr, $namespacesClause );
};
array_key_exists('r', $postValues) ?
$reloads = $postValues['r'] :
$reloads = 0;
array_key_exists( 'r', $postValues ) ? $reloads = $postValues['r'] : $reloads = 0;
// Retrieve page names from the database.
$res = $dbr->select(
$res = $dbr->select(
'page',
array('page_title', 'page_namespace'),
array(
'page_namespace IN ' . $namespacesClause,
),
__METHOD__,
array(
'LIMIT' => 999999999,
'page_namespace IN ' . $namespacesClause,
),
__METHOD__,
array(
'LIMIT' => 999999999,
'OFFSET' => $start
)
);
@ -134,50 +142,50 @@ class Special extends \SpecialPage {
// Iterate through the pages; break if a time limit is exceeded.
foreach ( $res as $row ) {
$curTitle = \Title::makeTitleSafe( $row->page_namespace, $row->page_title);
Extension::processPage($curTitle, $this->getContext());
Extension::processPage( $curTitle, $this->getContext() );
$start += 1;
// Check if the time limit is exceeded
if ( microtime(true)-$startTime > $wgLinkTitlesTimeLimit )
if ( microtime( true ) - $startTime > $this->config->specialPageReloadAfter )
{
break;
}
}
$this->addProgressInfo($output, $curTitle, $start, $end);
$this->addProgressInfo( $output, $curTitle, $start, $end );
// If we have not reached the last page yet, produce code to reload
// the extension's special page.
if ( $start < $end )
{
{
$reloads += 1;
// Build a form with hidden values and output JavaScript code that
// Build a form with hidden values and output JavaScript code that
// immediately submits the form in order to continue the process.
$output->addHTML($this->getReloaderForm($request->getRequestURL(),
$start, $end, $reloads));
$output->addHTML( $this->getReloaderForm( $request->getRequestURL(),
$start, $end, $reloads) );
}
else // Last page has been processed
{
$this->addCompletedInfo($output, $start, $end, $reloads);
$this->addCompletedInfo( $output, $start, $end, $reloads );
}
}
/// Adds WikiText to the output containing information about the extension
/// and a form and button to start linking.
/**
* Adds WikiText to the output containing information about the extension
* and a form and button to start linking.
*/
private function buildInfoPage( &$request, &$output ) {
$url = $request->getRequestURL();
// TODO: Put the page contents in messages in the i18n file.
$output->addWikiText(
<<<EOF
LinkTitles extension: http://www.mediawiki.org/wiki/Extension:LinkTitles
Source code: http://github.com/bovender/LinkTitles
LinkTitles extension: https://github.com/bovender/LinkTitles
== Batch Linking ==
You can start a batch linking process by clicking on the button below.
This will go through every page in the normal namespace of your Wiki and
insert links automatically. This page will repeatedly reload itself, in
This will go through every page in the normal namespace of your Wiki and
insert links automatically. This page will repeatedly reload itself, in
order to prevent blocking the server. To interrupt the process, simply
close this page.
EOF
@ -192,12 +200,13 @@ EOF
);
}
/// Produces informative output in WikiText format to show while working.
/// @param $output Output object.
/// @param $curTitle Title of the currently processed page.
/// @param $index Index of the currently processed page.
/// @param $end Last index that will be processed (i.e., number of
/// pages).
/**
* Produces informative output in WikiText format to show while working.
* @param $output Output object.
* @param $curTitle Title of the currently processed page.
* @param $index Index of the currently processed page.
* @param $end Last index that will be processed (i.e., number of pages).
*/
private function addProgressInfo( &$output, $curTitle, $index, $end ) {
$progress = $index / $end * 100;
$percent = sprintf("%01.1f", $progress);
@ -205,8 +214,8 @@ EOF
$output->addWikiText(
<<<EOF
== Processing pages... ==
The [http://www.mediawiki.org/wiki/Extension:LinkTitles LinkTitles]
extension is currently going through every page of your wiki, adding links to
The [https://github.com/bovender/LinkTitles LinkTitles]
extension is currently going through every page of your wiki, adding links to
existing pages as appropriate.
=== Current page: $curTitle ===
@ -232,14 +241,15 @@ EOF
);
}
/// Generates an HTML form and JavaScript to automatically submit the
/// form.
/// @param $url URL to reload with a POST request.
/// @param $start Index of the next page that shall be processed.
/// @param $end Index of the last page to be processed.
/// @param $reloads Counter that holds the number of reloads so far.
/// @returns String that holds the HTML for a form and a
/// JavaScript command.
/**
* Generates an HTML form and JavaScript to automatically submit the
* form.
* @param $url URL to reload with a POST request.
* @param $start Index of the next page that shall be processed.
* @param $end Index of the last page to be processed.
* @param $reloads Counter that holds the number of reloads so far.
* @return String that holds the HTML for a form and a JavaScript command.
*/
private function getReloaderForm( $url, $start, $end, $reloads ) {
return
<<<EOF
@ -255,14 +265,15 @@ EOF
;
}
/// Adds statistics to the page when all processing is done.
/// @param $output Output object
/// @param $start Index of the first page that was processed.
/// @param $end Index of the last processed page.
/// @param $reloads Number of reloads of the page.
/// @returns undefined
/**
* Adds statistics to the page when all processing is done.
* @param $output Output object
* @param $start Index of the first page that was processed.
* @param $end Index of the last processed page.
* @param $reloads Number of reloads of the page.
* @return undefined
*/
private function addCompletedInfo( &$output, $start, $end, $reloads ) {
global $wgLinkTitlesTimeLimit;
$pagesPerReload = sprintf('%0.1f', $end / $reloads);
$output->addWikiText(
<<<EOF
@ -271,7 +282,7 @@ EOF
|-
| total number of pages: || ${end}
|-
| timeout setting [s]: || ${wgLinkTitlesTimeLimit}
| timeout setting [s]: || {$this->config->specialPageReloadAfter}
|-
| webpage reloads: || ${reloads}
|-
@ -281,19 +292,21 @@ EOF
);
}
/// Counts the number of pages in a read-access wiki database ($dbr).
/// @param $dbr Read-only `Database` object.
/// @returns Number of pages in the default namespace (0) of the wiki.
private function countPages(&$dbr, $namespacesClause) {
/**
* Counts the number of pages in a read-access wiki database ($dbr).
* @param $dbr Read-only `Database` object.
* @param $namespacesClause Parenthesized list of namespaces for the SQL IN clause.
* @return Number of pages in the given namespaces of the wiki.
*/
private function countPages( &$dbr, $namespacesClause ) {
$res = $dbr->select(
'page',
array('pagecount' => "COUNT(page_id)"),
array(
'page_namespace IN ' . $namespacesClause,
),
__METHOD__
array(
'page_namespace IN ' . $namespacesClause,
),
__METHOD__
);
return $res->current()->pagecount;
}
}

147
includes/Splitter.php Normal file
View File

@ -0,0 +1,147 @@
<?php
/**
* The Splitter class caches a regular expression that delimits text to be parsed.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Caches a regular expression that delimits text to be parsed.
*/
class Splitter {
/**
* The splitting expression that separates text to be parsed from text that
* must not be parsed.
* @var String $splitter
*/
public $splitter;
/**
* The LinkTitles configuration for this Splitter instance.
* @var Config $config
*/
public $config;
private static $instance;
/**
* Gets the Splitter singleton; may build one with the given config or the
* default config if none is given.
*
* If the instance was already created, it does not matter what Config this
* method is called with. To re-create an instance with a different Config,
* call Splitter::invalidate() first.
*
* @param Config|null $config LinkTitles configuration.
*/
public static function singleton( Config &$config = null ) {
if ( self::$instance === null ) {
if ( $config === null ) {
$config = new Config();
}
self::$instance = new Splitter( $config );
}
return self::$instance;
}
/**
* Invalidates the singleton instance.
*
* Used for unit testing.
*/
public static function invalidate() {
self::$instance = null;
}
protected function __construct( Config $config) {
$this->config = $config;
$this->buildSplitter();
}
/**
* Splits a text into sections that may be linked and sections that may not
* be linked (e.g., because they already are a link, or a template, etc.).
*
* @param String &$text Text to split.
* @return Array of strings where even indexes point to linkable sections.
*/
public function split( &$text ) {
return preg_split( $this->splitter, $text, -1, PREG_SPLIT_DELIM_CAPTURE );
}
/**
* Builds the delimiter that is used in a regexp to separate
* text that should be parsed from text that should not be
* parsed (e.g. inside existing links etc.)
*/
private function buildSplitter() {
if ( $this->config->skipTemplates )
{
// Use recursive regex to balance curly braces;
// see http://www.regular-expressions.info/recurse.html
$templatesDelimiter = '{{(?>[^{}]|(?R))*}}|';
} else {
// Match template names (ignoring any piped [[]] links in them)
// along with the trailing pipe and parameter name or closing
// braces; also match sequences of '|wordcharacters=' (without
// spaces in them) that usually only occur as parameter names in
// transclusions (but could also occur as wiki table cell contents).
// TODO: Find a way to match parameter names in transclusions, but
// not in table cells or other sequences involving a pipe character
// and equal sign.
$templatesDelimiter = '{{[^|]*?(?:(?:\[\[[^]]+]])?)[^|]*?(?:\|(?:\w+=)?|(?:}}))|\|\w+=|';
}
// Build a regular expression that will capture existing wiki links ("[[...]]"),
// wiki headings ("= ... =", "== ... ==" etc.),
// urls ("http://example.com", "[http://example.com]", "[http://example.com Description]"),
// and email addresses ("mail@example.com").
// Match WikiText headings.
// Since there is a user option to skip headings, we make this part of the
// expression optional. Note that in order to use preg_split(), it is
// important to have only one capturing subpattern (which precludes the use
// of conditional subpatterns).
// Caveat: This regex pattern should be improved to deal with balanced '='s
// only. However, this would require grouping in the pattern which does not
// agree with preg_split.
$headingsDelimiter = $this->config->parseHeadings ? '' : '^=+[^=]+=+$|';
$urlPattern = '[a-z]+?\:\/\/(?:\S+\.)+\S+(?:\/.*)?';
$this->splitter = '/(' . // exclude from linking:
'\[\[.*?\]\]|' . // links
$headingsDelimiter . // headings (if requested)
$templatesDelimiter . // templates (if requested)
'^ .+?\n|\n .+?\n|\n .+?$|^ .+?$|' . // preformatted text
'<nowiki>.*?<\/nowiki>|<code>.*?<\/code>|' . // nowiki/code
'<pre>.*?<\/pre>|<html>.*?<\/html>|' . // pre/html
'<script>.*?<\/script>|' . // script
'<gallery>.*?<\/gallery>|' . // gallery
'<div.+?>|<\/div>|' . // attributes of div elements
'<span.+?>|<\/span>|' . // attributes of span elements
'<file>[^<]*<\/file>|' . // stuff inside file elements
'style=".+?"|class=".+?"|' . // styles and classes (e.g. of wikitables)
'<noautolinks>.*?<\/noautolinks>|' . // custom tag 'noautolinks'
'\[' . $urlPattern . '\s.+?\]|'. $urlPattern . '(?=\s|$)|' . // urls
'(?<=\b)\S+\@(?:\S+\.)+\S+(?=\b)' . // email addresses
')/ismS';
}
}
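
The `skipTemplates` branch of `buildSplitter()` uses a recursive pattern, `(?R)`, so that nested templates are treated as a single protected section. A minimal sketch of that idea in isolation follows. The pattern here is a hypothetical reduction that protects only `{{...}}` templates, with the braces escaped for readability, and the sample text is made up for illustration.

```php
<?php
// Hypothetical reduced pattern: a template is '{{', then any run of
// non-brace characters or nested templates (via recursion), then '}}'.
$pattern = '/(\{\{(?>[^{}]|(?R))*\}\})/';
$text = 'See {{Infobox|name={{PAGENAME}}}} for details.';

// One capturing group, so odd indexes of the result hold the templates.
$sections = preg_split( $pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE );

foreach ( $sections as $index => $section ) {
	echo ( $index % 2 === 0 ? 'linkable:  ' : 'protected: ' ), $section, "\n";
}
```

Without the recursion, a non-greedy `{{.*?}}` would stop at the first `}}` and leave the tail of the outer template exposed to linking.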

232
includes/Target.php Normal file
View File

@ -0,0 +1,232 @@
<?php
/**
* The LinkTitles\Target represents a Wiki page that is a potential link target.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Represents a page that is a potential link target.
*/
class Target {
/**
* A Title object for the target page currently being examined.
* @var \Title $title
*/
private $title;
/**
* Caches the target page content as a \Content object.
*
* @var \Content $content
*/
private $content;
/**
* Regex that matches the start of a word; this expression depends on the
* setting of LinkTitles\Config->wordStartOnly;
* @var String $wordStart
*/
public $wordStart;
/**
* Regex that matches the end of a word; this expression depends on the
* setting of LinkTitles\Config->wordEndOnly;
* @var String $wordEnd
*/
public $wordEnd;
/**
* LinkTitles configuration.
* @var Config $config
*/
private $config;
private $caseSensitiveLinkValueRegex;
private $nsText;
/**
* Constructs a new Target object
*
* The parameters may be taken from database rows, for example.
*
* @param Int $namespace Name space of the target page
* @param String $title Title of the target page
* @param Config &$config LinkTitles configuration
*/
public function __construct( $namespace, $title, Config &$config ) {
// print "\n>>>namespace=$namespace;title=$title<<<\n";
$this->title = \Title::makeTitleSafe( $namespace, $title );
$this->titleValue = $this->title->getTitleValue();
$this->config = $config;
// Use unicode character properties rather than \b escape sequences
// to detect whole words containing non-ASCII characters as well.
// Note that this requires a PCRE library that was compiled with
// --enable-unicode-properties
$this->wordStart = $config->wordStartOnly ? '(?<!\pL)' : '';
$this->wordEnd = $config->wordEndOnly ? '(?!\pL)' : '';
}
/**
* Gets the string representation of the target title.
* @return String title text
*/
public function getTitleText() {
return $this->title->getText();
}
public function getPrefixedTitleText() {
return $this->getNsPrefix() . $this->getTitleText();
}
/**
* Gets the string representation of the target's namespace.
*
* May be false if the namespace is NS_MAIN. The value is cached.
* @return String|bool Target's namespace
*/
public function getNsText() {
if ( $this->nsText === null ) {
$this->nsText = $this->title->getNsText();
}
return $this->nsText;
}
/**
* Gets the namespace prefix. This is the namespace text followed by a colon,
* or an empty string if the namespace text evaluates to false (e.g. NS_MAIN).
* @return String namespace prefix
*/
public function getNsPrefix() {
return $this->getNsText() ? $this->getNsText() . ':' : '';
}
/**
* Gets the title string with certain characters escaped that may interfere
* with regular expressions.
* @return String representation of the title, regex-safe
*/
public function getRegexSafeTitle() {
return preg_quote( $this->title->getText(), '/' );
}
/**
* Builds a case-sensitive regular expression pattern for the title.
* @return String case-sensitive regular expression pattern for the title
*/
public function getCaseSensitiveRegex() {
return $this->buildRegex( $this->getCaseSensitiveLinkValueRegex() );
}
/**
* Builds a regular expression pattern for the title in a case-insensitive
* way.
* @return String case-insensitive regular expression pattern for the title
*/
public function getCaseInsensitiveRegex() {
return $this->buildRegex( $this->getRegexSafeTitle() ) . 'i';
}
/**
* Builds the basic regex that is used to match target page titles in a source
* text.
* @param String $searchTerm Target page title (special characters must be quoted)
* @return String regular expression pattern
*/
private function buildRegex( $searchTerm ) {
return '/(?<![\:\.\@\/\?\&])' . $this->wordStart . $searchTerm . $this->wordEnd . '/S';
}
/**
* Gets the (cached) regex for the link value.
*
* Depending on the $config->capitalLinks setting, the title has to be
* searched for either in a strictly case-sensitive way, or in a 'fuzzy' way
* where the first letter of the title may be either case.
*
* @return String regular expression pattern for the link value.
*/
public function getCaseSensitiveLinkValueRegex() {
if ( $this->caseSensitiveLinkValueRegex === null ) {
$regexSafeTitle = $this->getRegexSafeTitle();
if ( $this->config->capitalLinks && preg_match( '/[a-zA-Z]/', $regexSafeTitle[0] ) ) {
$this->caseSensitiveLinkValueRegex = '((?i)' . $regexSafeTitle[0] . '(?-i)' . substr($regexSafeTitle, 1) . ')';
} else {
$this->caseSensitiveLinkValueRegex = '(' . $regexSafeTitle . ')';
}
}
return $this->caseSensitiveLinkValueRegex;
}
/**
* Returns the \Content of the target page.
*
* The value is cached.
* @return \Content Content of the Target page.
*/
public function getContent() {
if ( $this->content === null ) {
$this->content = \WikiPage::factory( $this->title )->getContent();
}
return $this->content;
}
/**
* Examines the current target page. Returns true if it may be linked;
* false if not. This depends on two settings:
* $wgLinkTitlesCheckRedirect and $wgLinkTitlesEnableNoTargetMagicWord
* and whether the target page is a redirect or contains the
* __NOAUTOLINKTARGET__ magic word.
*
* @param Source $source Source page for which links are being generated
* @return boolean
*/
public function mayLinkTo( Source $source ) {
// If checking for redirects is enabled and the target page does
// indeed redirect to the current page, the target must not be linked.
if ( $this->config->checkRedirect ) {
$redirectTitle = $this->getContent()->getUltimateRedirectTarget();
if ( $redirectTitle && $redirectTitle->equals( $source->getTitle() ) ) {
return false;
}
}
// If the magic word __NOAUTOLINKTARGET__ is enabled and the target
// page does indeed contain this magic word, the target must not be
// linked.
if ( $this->config->enableNoTargetMagicWord ) {
if ( $this->getContent()->matchMagicWord( \MagicWord::get('MAG_LINKTITLES_NOTARGET') ) ) {
return false;
}
}
return true;
}
/**
* Determines if the Target's title is the same as the title of a Source.
* @param Source $source Source object.
* @return boolean True if the title of $source is the same, false if not.
*/
public function isSameTitle( Source $source) {
return $this->title->equals( $source->getTitle() );
}
}
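
The 'fuzzy' first-letter matching described for getCaseSensitiveLinkValueRegex() can be tried outside MediaWiki. The following is a minimal sketch, assuming only the pattern fragments shown in the diff; the function name `firstLetterFuzzyPattern` is hypothetical and not part of the extension.

```php
<?php
// Hypothetical stand-alone helper; mirrors the capitalLinks branch of
// Target::getCaseSensitiveLinkValueRegex() without any MediaWiki classes.
function firstLetterFuzzyPattern( string $title ): string {
    $safe = preg_quote( $title, '/' );
    // Only an ASCII first letter gets the inline case-insensitive flag;
    // anything else (e.g. an escaped '.') falls back to an exact match.
    if ( preg_match( '/[a-zA-Z]/', $safe[0] ) ) {
        // (?i) switches case-insensitivity on for the first character only,
        // (?-i) switches it off again for the rest of the title.
        return '((?i)' . $safe[0] . '(?-i)' . substr( $safe, 1 ) . ')';
    }
    return '(' . $safe . ')';
}

$regex = '/' . firstLetterFuzzyPattern( 'Link target' ) . '/';
$lower = preg_match( $regex, 'see link target' ); // first letter in lower case
$upper = preg_match( $regex, 'see Link target' ); // first letter in upper case
$other = preg_match( $regex, 'see Link Target' ); // differs after the first letter
```

Restricting the inline flag to a single character keeps the remainder of the title strictly case-sensitive, which is what the smart-mode tests further down in this commit appear to rely on.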

162
includes/Targets.php Normal file

@ -0,0 +1,162 @@
<?php
/**
* The LinkTitles\Targets class.
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
/**
* Fetches potential target page titles from the database.
*/
class Targets {
private static $instance;
/**
* Singleton factory that returns a (cached) Targets instance holding the
* database query result with potential target page titles.
*
* The subset of pages that may serve as target pages depends on the namespace
* of the source page. Therefore, if the namespace of $title differs from the
* cached namespace, the database is queried again.
*
* @param \Title $title Title of the current (source) page.
* @param Config $config LinkTitles configuration.
* @return Targets
*/
public static function singleton( \Title $title, Config $config ) {
if ( ( self::$instance === null ) || ( self::$instance->sourceNamespace != $title->getNamespace() ) ) {
self::$instance = new Targets( $title, $config );
}
return self::$instance;
}
/**
* Invalidates the cache; the next call of Targets::singleton() will trigger
* a database query.
*
* Use this in unit tests, which run within a single request cycle: without
* it, changes to the list of pages would not be picked up by the cached
* Targets instance.
*/
public static function invalidate() {
self::$instance = null;
}
/**
* Holds the results of a database query for target page titles, filtered
* and sorted.
* @var IResultWrapper $queryResult
*/
public $queryResult;
/**
* Holds the source page's namespace (integer) for which the list of target
* pages was built.
* @var Int $sourceNamespace
*/
public $sourceNamespace;
private $config;
/**
* The constructor is private to enforce using the singleton pattern.
* @param \Title $title
*/
private function __construct( \Title $title, Config $config) {
$this->config = $config;
$this->sourceNamespace = $title->getNamespace();
$this->fetch();
}
/**
* Fetches the page titles from the database.
*/
private function fetch() {
$sortOrder = $this->config->preferShortTitles ? 'ASC' : 'DESC';
// Build a blacklist of pages that are not supposed to be link
// targets. This includes the current page.
if ( $this->config->blackList ) {
$blackList = 'page_title NOT IN ' .
str_replace( ' ', '_', '("' . implode( '","', str_replace( '"', '\"', $this->config->blackList ) ) . '")' );
} else {
$blackList = null;
}
if ( $this->config->sameNamespace ) {
// Build our weight list. Make sure current namespace is first element
$namespaces = array_diff( $this->config->targetNamespaces, [ $this->sourceNamespace ] );
array_unshift( $namespaces, $this->sourceNamespace );
} else {
$namespaces = $this->config->targetNamespaces;
}
if ( !$namespaces) {
// If there are absolutely no target namespaces (not even the one of the
// source page), we can just return.
return;
}
// No need for a sanity check; we are sure that we have at least one element in the array.
$weightSelect = "CASE page_namespace ";
$currentWeight = 0;
foreach ( $namespaces as $namespaceValue ) {
$currentWeight = $currentWeight + 100;
$weightSelect = $weightSelect . " WHEN " . $namespaceValue . " THEN " . $currentWeight . PHP_EOL;
}
$weightSelect = $weightSelect . " END ";
$namespacesClause = '(' . implode( ', ', $namespaces ) . ')';
// Build an SQL query and fetch all page titles from the target
// namespaces, ordered by namespace weight and then by title length
// (ascending or descending, depending on preferShortTitles). Since the
// db may be sqlite, we need a try..catch structure because sqlite does
// not support the CHAR_LENGTH function.
$dbr = wfGetDB( DB_SLAVE );
try {
$this->queryResult = $dbr->select(
'page',
array( 'page_title', 'page_namespace' , "weight" => $weightSelect),
array_filter(
array(
'page_namespace IN ' . $namespacesClause,
'CHAR_LENGTH(page_title) >= ' . $this->config->minimumTitleLength,
$blackList,
)
),
__METHOD__,
array( 'ORDER BY' => 'weight ASC, CHAR_LENGTH(page_title) ' . $sortOrder )
);
} catch ( \Exception $e ) {
$this->queryResult = $dbr->select(
'page',
array( 'page_title', 'page_namespace' , "weight" => $weightSelect ),
array_filter(
array(
'page_namespace IN ' . $namespacesClause,
'LENGTH(page_title) >= ' . $this->config->minimumTitleLength,
$blackList,
)
),
__METHOD__,
array( 'ORDER BY' => 'weight ASC, LENGTH(page_title) ' . $sortOrder )
);
}
}
}
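
The namespace weighting in Targets::fetch() can be sketched in isolation. This is a minimal illustration under stated assumptions: the function name `buildWeightSelect` is hypothetical, the `(int)` cast is an addition, and fragments are joined with spaces instead of `PHP_EOL`; the resulting CASE expression is otherwise meant to be equivalent.

```php
<?php
// Hypothetical stand-alone helper; reproduces how fetch() orders namespaces
// and turns that order into an SQL CASE expression used for sorting.
function buildWeightSelect( array $targetNamespaces, int $sourceNamespace, bool $sameNamespace ): ?string {
    if ( $sameNamespace ) {
        // Put the source page's namespace first so it gets the lowest weight.
        $namespaces = array_diff( $targetNamespaces, [ $sourceNamespace ] );
        array_unshift( $namespaces, $sourceNamespace );
    } else {
        $namespaces = $targetNamespaces;
    }
    if ( !$namespaces ) {
        return null; // no target namespaces at all, nothing to query
    }
    $sql = 'CASE page_namespace';
    $weight = 0;
    foreach ( $namespaces as $ns ) {
        $weight += 100;
        $sql .= ' WHEN ' . (int)$ns . ' THEN ' . $weight;
    }
    return $sql . ' END';
}

$weighted = buildWeightSelect( [ 0, 4, 12 ], 4, true );
$unweighted = buildWeightSelect( [ 0, 4 ], 4, false );
```

Sorting by `weight ASC` first and by title length second means that pages in the source namespace are considered before pages from any other target namespace.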


@ -1,21 +1,23 @@
<?php
/*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> @bovender
/**
* LinkTitles command line interface (CLI)/maintenance script
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> @bovender
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*/
namespace LinkTitles;
@ -44,17 +46,21 @@ else
}
};
require_once( __DIR__ . "/includes/LinkTitles_Extension.php" );
require_once( __DIR__ . "/includes/Extension.php" );
/// Core class of the maintanance script.
/// @note Note that the execution of maintenance scripts is prohibited for
/// an Apache web server due to a `.htaccess` file that declares `deny from
/// all`. Other webservers may exhibit different behavior. Be aware that
/// anybody who is able to execute this script may place a high load on the
/// server.
/// @ingroup batch
/**
* Core class of the maintenance script.
* @note Note that the execution of maintenance scripts is prohibited for
* an Apache web server due to a `.htaccess` file that declares `deny from
* all`. Other webservers may exhibit different behavior. Be aware that
* anybody who is able to execute this script may place a high load on the
* server.
* @ingroup batch
*/
class Cli extends \Maintenance {
/// The constructor adds a description and one option.
/**
* Constructor.
*/
public function __construct() {
parent::__construct();
$this->addDescription("Iterates over wiki pages and automatically adds links to other pages.");
@ -65,41 +71,45 @@ class Cli extends \Maintenance {
true, // requires argument
"s"
);
$this->addOption(
$this->addOption(
"page",
"page name to process",
false, // not required
true, // requires argument
"p"
);
$this->addOption(
"log",
"enables logging to console",
false, // not required
false, // requires no argument
"l"
);
$this->addOption(
"debug",
"enables debug logging to console",
false, // not required
false // requires no argument
);
// TODO: Add back logging options.
// TODO: Add configuration options.
// $this->addOption(
// "log",
// "enables logging to console",
// false, // not required
// false, // requires no argument
// "l"
// );
// $this->addOption(
// "debug",
// "enables debug logging to console",
// false, // not required
// false // requires no argument
// );
}
/// Main function of the maintenance script.
/// Will iterate over all pages in the wiki (starting at a certain index,
/// if the `--start` option is given) and call LinkTitles::processPage() for
/// each page.
/**
* Main function of the maintenance script.
* Will iterate over all pages in the wiki (starting at a certain index,
* if the `--start` option is given) and call LinkTitles::processPage() for
* each page.
*/
public function execute() {
if ($this->hasOption('log'))
{
Extension::$ltConsoleOutput = true;
}
if ($this->hasOption('debug'))
{
Extension::$ltConsoleOutputDebug = true;
}
// if ($this->hasOption('log'))
// {
// Extension::$ltConsoleOutput = true;
// }
// if ($this->hasOption('debug'))
// {
// Extension::$ltConsoleOutputDebug = true;
// }
if ( $this->hasOption('page') ) {
if ( !$this->hasOption( 'start' ) ) {
$this->singlePage();
@ -113,10 +123,14 @@ class Cli extends \Maintenance {
if ( $startIndex < 0 ) {
$this->error( 'FATAL: Start index must be 0 or greater.', 1 );
};
$this->allPages( $startIndex);
$this->allPages( $startIndex );
}
}
/**
* Processes a single page.
* @return bool True on success, false on failure.
*/
private function singlePage() {
$pageName = strval( $this->getOption( 'page' ) );
$this->output( "Processing single page: '$pageName'\n" );
@ -131,12 +145,17 @@ class Cli extends \Maintenance {
return $success;
}
/**
* Process all pages in the Wiki.
* @param integer $index Index of the start page.
* @return bool True on success, false on failure.
*/
private function allPages( $index = 0 ) {
global $wgLinkTitlesNamespaces;
$config = new Config();
// Retrieve page names from the database.
$dbr = $this->getDB( DB_SLAVE );
$namespacesClause = str_replace( '_', ' ','(' . implode( ', ', $wgLinkTitlesNamespaces ) . ')' );
$namespacesClause = str_replace( '_', ' ','(' . implode( ', ', $config->sourceNamespaces ) . ')' );
$res = $dbr->select(
'page',
array( 'page_title', 'page_namespace' ),


@ -0,0 +1,41 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* Tests the LinkTitles\Config class.
*
* This single unit test basically serves to ensure the Config class is working.
* @group bovender
* @group Database
*/
class ConfigTest extends LinkTitles\TestCase {
public function testParseOnEdit() {
$this->setMwGlobals( [
'wgLinkTitlesParseOnEdit' => true,
'wgLinkTitlesParseOnRender' => false
] );
$config = new LinkTitles\Config();
global $wgLinkTitlesParseOnEdit;
$this->assertSame( $wgLinkTitlesParseOnEdit, $config->parseOnEdit );
}
}


@ -0,0 +1,48 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* @group bovender
* @group Database
*/
class ExtensionTest extends LinkTitles\TestCase {
public function testParseOnEdit() {
$this->setMwGlobals( [
'wgLinkTitlesParseOnEdit' => true,
'wgLinkTitlesParseOnRender' => false
] );
$pageId = $this->insertPage( 'test page', 'This page should link to the link target but not to test page' )['id'];
$page = WikiPage::newFromId( $pageId );
$this->assertSame( 'This page should link to the [[link target]] but not to test page', self::getPageText( $page ) );
}
public function testDoNotParseOnEdit() {
$this->setMwGlobals( [
'wgLinkTitlesParseOnEdit' => false,
'wgLinkTitlesParseOnRender' => false
] );
$pageId = $this->insertPage( 'test page', 'This page should not link to the link target' )['id'];
$page = WikiPage::newFromId( $pageId );
$this->assertSame( 'This page should not link to the link target', self::getPageText( $page ) );
}
}


@ -0,0 +1,256 @@
<?php
/**
* Unit tests for the Linker class, i.e. the core functionality
*
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* Unit tests for the LinkTitles\Linker class.
*
* The test class is prefixed with 'LinkTitles' to avoid a naming collision
* with a class that exists in the MediaWiki core.
*
* (Ideally the test classes should be namespaced, but when you do that, they
* will no longer be automatically discovered.)
*
* @group bovender
* @group Database
*/
class LinkTitlesLinkerTest extends LinkTitles\TestCase {
protected $title;
protected function setUp() {
parent::setUp(); // call last to have the Targets object invalidated after inserting the page
}
public function addDBData() {
$this->title = $this->insertPage( 'source page', 'This page is the test page' )['title'];
$this->insertPage( 'link target', 'This page serves as a link target' );
parent::addDBDataOnce(); // call parent after adding page to have targets invalidated
}
/**
* @dataProvider provideLinkContentTemplatesData
*/
public function testLinkContentTemplates( $skipTemplates, $input, $expectedOutput ) {
$config = new LinkTitles\Config();
$config->firstOnly = false;
$config->skipTemplates = $skipTemplates;
LinkTitles\Splitter::invalidate();
$source = LinkTitles\Source::createFromTitleAndText( $this->title, $input, $config );
$linker = new LinkTitles\Linker( $config );
$result = $linker->linkContent( $source );
if ( !$result ) { $result = $input; }
$this->assertSame( $expectedOutput, $result );
}
public function provideLinkContentTemplatesData() {
return [
[
true, // skipTemplates
'With skipTemplates = true, a {{template|with=link target}} in it should not be linked',
'With skipTemplates = true, a {{template|with=link target}} in it should not be linked',
],
[
false, // skipTemplates
'With skipTemplates = false, a {{template|with=link target}} in it should be linked',
'With skipTemplates = false, a {{template|with=[[link target]]}} in it should be linked',
],
[
false, // skipTemplates
'With skipTemplates = false, a {{template|with=already linked [[link target]]}} in it should not be linked again',
'With skipTemplates = false, a {{template|with=already linked [[link target]]}} in it should not be linked again',
]
];
}
/**
* @dataProvider provideLinkContentSmartModeData
*/
public function testLinkContentSmartMode( $capitalLinks, $smartMode, $input, $expectedOutput ) {
$this->setMwGlobals( 'wgCapitalLinks', $capitalLinks );
$config = new LinkTitles\Config();
$config->firstOnly = false;
$config->smartMode = $smartMode;
$linker = new LinkTitles\Linker( $config );
$source = LinkTitles\Source::createFromTitleAndText( $this->title, $input, $config );
$result = $linker->linkContent( $source );
if ( !$result ) { $result = $input; }
$this->assertSame( $expectedOutput, $result );
}
public function provideLinkContentSmartModeData() {
return [
[
true, // wgCapitalLinks
true, // smartMode
'With smart mode on and $wgCapitalLinks = true, this page should link to link target',
'With smart mode on and $wgCapitalLinks = true, this page should link to [[link target]]'
],
[
true, // wgCapitalLinks
false, // smartMode
'With smart mode off and $wgCapitalLinks = true, this page should link to link target',
'With smart mode off and $wgCapitalLinks = true, this page should link to [[link target]]'
],
[
true, // wgCapitalLinks
true, // smartMode
'With smart mode on and $wgCapitalLinks = true, this page should link to Link target',
'With smart mode on and $wgCapitalLinks = true, this page should link to [[Link target]]'
],
[
true, // wgCapitalLinks
false, // smartMode
'With smart mode off and $wgCapitalLinks = true, this page should not link to Link Target',
'With smart mode off and $wgCapitalLinks = true, this page should not link to Link Target'
],
[
false, // wgCapitalLinks
true, // smartMode
'With smart mode on and $wgCapitalLinks = false, this page should link to Link target',
'With smart mode on and $wgCapitalLinks = false, this page should link to [[Link target]]'
],
[
false, // wgCapitalLinks
true, // smartMode
'With smart mode on and $wgCapitalLinks = false, this page should link to link target',
'With smart mode on and $wgCapitalLinks = false, this page should link to [[Link target|link target]]'
],
[
false, // wgCapitalLinks
false, // smartMode
'With smart mode off and $wgCapitalLinks = false, this page should not link to link target',
'With smart mode off and $wgCapitalLinks = false, this page should not link to link target'
],
[
false, // wgCapitalLinks
false, // smartMode
'With smart mode off and $wgCapitalLinks = false, this page should link to Link target',
'With smart mode off and $wgCapitalLinks = false, this page should link to [[Link target]]'
],
[
false, // wgCapitalLinks
true, // smartMode
'With smart mode on and $wgCapitalLinks = false, this page should link to Link target',
'With smart mode on and $wgCapitalLinks = false, this page should link to [[Link target]]'
],
[
false, // wgCapitalLinks
false, // smartMode
'With smart mode off and $wgCapitalLinks = false, this page should not link to Link Target',
'With smart mode off and $wgCapitalLinks = false, this page should not link to Link Target'
],
];
}
/**
* @dataProvider provideLinkContentFirstOnlyData
*/
public function testLinkContentFirstOnly( $firstOnly, $input, $expectedOutput ) {
$config = new LinkTitles\Config();
$config->firstOnly = $firstOnly;
$linker = new LinkTitles\Linker( $config );
$source = LinkTitles\Source::createFromTitleAndText( $this->title, $input, $config );
$result = $linker->linkContent( $source );
if ( !$result ) { $result = $input; }
$this->assertSame( $expectedOutput, $result );
}
public function provideLinkContentFirstOnlyData() {
return [
[
false, // firstOnly
'With firstOnly = false, link target is a link target multiple times',
'With firstOnly = false, [[link target]] is a [[link target]] multiple times'
],
[
false, // firstOnly
'With firstOnly = false, [[link target]] is a link target multiple times',
'With firstOnly = false, [[link target]] is a [[link target]] multiple times'
],
[
true, // firstOnly
'With firstOnly = true, link target is a link target only once',
'With firstOnly = true, [[link target]] is a link target only once'
],
[
true, // firstOnly
'With firstOnly = true, [[link target]] is a link target only once',
'With firstOnly = true, [[link target]] is a link target only once'
],
];
}
public function testLinkContentBlackList() {
$config = new LinkTitles\Config();
$config->blackList = [ 'Foo', 'Link target', 'Bar' ];
LinkTitles\Targets::invalidate();
$linker = new LinkTitles\Linker( $config );
$text = 'If the link target is blacklisted, it should not be linked';
$source = LinkTitles\Source::createFromTitleAndText( $this->title, $text, $config );
$result = $linker->linkContent( $source );
if ( !$result ) { $result = $text; }
$this->assertSame( $text, $result );
}
// Tests for namespace handling are commented out until I find a way to add
// a custom namespace during testing. (The assertTrue assertion below fails.)
// /**
// * @dataProvider provideLinkContentNamespacesData
// */
// public function testLinkContentNamespaces( $namespaces, $input, $expectedOutput ) {
// $ns = 4000;
// $this->setMwGlobals( [
// "wgExtraNamespaces[$ns]" => 'custom_namespace'
// ] );
// // global $wgExtraNamespaces;
// // global $wgContentNamespaces;
// // $wgContentNamespaces[] = $ns;
// // $wgExtraNamespaces[$ns] = 'custom_adsf';
// $this->insertPage( 'in custom namespace', 'This is a page in a custom namespace', $ns );
// $this->assertTrue( MWNamespace::exists( $ns ), "The name space with id $ns should exist!" );
// LinKTitles\Targets::invalidate();
// $config = new LinkTitles\Config();
// $config->namespaces = $namespaces;
// $linker = new LinkTitles\Linker( $config );
// $this->assertSame( $expectedOutput, $linker->linkContent( $this->title, $input ));
// }
// public function provideLinkContentNamespacesData() {
// return [
// [
// [], // namespaces
// 'With namespaces = [], page in custom namespace should not be linked',
// 'With namespaces = [], page in custom namespace should not be linked'
// ],
// [
// [ 4000 ], // namespaces
// 'With namespaces = [ 4000 ], page in custom namespace should be linked',
// 'With namespaces = [ 4000 ], page [[custom_namespace:in custom namespace]] should be linked'
// ],
// ];
// }
}
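
The blacklist test above passes titles with spaces, whereas the SQL condition built in Targets::fetch() uses underscores. A minimal sketch of that clause construction follows; the function name `buildBlackListClause` is hypothetical, and the quoting shown is copied from the diff rather than recommended. In production code, the database abstraction's own quoting (for example `addQuotes()`) would likely be the safer choice.

```php
<?php
// Hypothetical stand-alone helper; mirrors the string operations that
// Targets::fetch() applies to $config->blackList.
function buildBlackListClause( array $blackList ): ?string {
    if ( !$blackList ) {
        return null; // an empty blacklist adds no condition to the query
    }
    // Escape embedded double quotes, then wrap every title in double quotes.
    $escaped = str_replace( '"', '\"', $blackList );
    $list = '("' . implode( '","', $escaped ) . '")';
    // Page titles are stored with underscores instead of spaces.
    return 'page_title NOT IN ' . str_replace( ' ', '_', $list );
}

$clause = buildBlackListClause( [ 'Foo', 'Link target', 'Bar' ] );
```

Because the clause is `null` for an empty list, it can be dropped from the conditions array with `array_filter()`, as the diff does.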


@ -0,0 +1,98 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* Tests the LinkTitles\Splitter class.
*
* @group bovender
*/
class SplitterTest extends MediaWikiTestCase {
/**
* @dataProvider provideSplitData
*/
public function testSplit( $skipTemplates, $parseHeadings, $input, $expectedOutput ) {
$config = new LinkTitles\Config();
$config->skipTemplates = $skipTemplates;
$config->parseHeadings = $parseHeadings;
LinkTitles\Splitter::invalidate();
$splitter = LinkTitles\Splitter::singleton( $config );
$this->assertSame( $skipTemplates, $splitter->config->skipTemplates, 'Splitter has incorrect skipTemplates config');
$this->assertSame( $parseHeadings, $splitter->config->parseHeadings, 'Splitter has incorrect parseHeadings config');
$this->assertSame( $expectedOutput, $splitter->split( $input ) );
}
// TODO: Add more examples.
public static function provideSplitData() {
return [
[
true, // skipTemplates
false, // parseHeadings
'this may be linked [[this may not be linked]]',
[ 'this may be linked ', '[[this may not be linked]]', '' ]
],
[
true, // skipTemplates
false, // parseHeadings
'this may be linked <gallery>this may not be linked</gallery>',
[ 'this may be linked ', '<gallery>this may not be linked</gallery>', '' ]
],
[
true, // skipTemplates
false, // parseHeadings
'With skipTemplates = true, this may be linked {{mytemplate|param=link target}}',
[ 'With skipTemplates = true, this may be linked ', '{{mytemplate|param=link target}}', '' ]
],
[
false, // skipTemplates
false, // parseHeadings
'With skipTemplates = false, this may be linked {{mytemplate|param=link target}}',
[ 'With skipTemplates = false, this may be linked ', '{{mytemplate|param=', 'link target}}' ]
],
[
true, // skipTemplates
false, // parseHeadings
'With skipTemplates = true, this may be linked {{mytemplate|param={{transcluded}}}}',
[ 'With skipTemplates = true, this may be linked ', '{{mytemplate|param={{transcluded}}}}', '' ]
],
[
true, // skipTemplates
true, // parseHeadings
"With parseHeadings = true,\n==a heading may be linked==\n",
[ "With parseHeadings = true,\n==a heading may be linked==\n" ]
],
[
true, // skipTemplates
false, // parseHeadings
// no trailing newline in the following string because it would be swallowed
"With parseHeadings = false,\n==a heading may not be linked==",
[ "With parseHeadings = false,\n", "==a heading may not be linked==", '' ]
],
// Improperly formatted headings cannot be dealt with appropriately for now
// [
// true, // skipTemplates
// false, // parseHeadings
// "With parseHeadings = false,\n==an improperly formatted heading may be linked=\n",
// [ "With parseHeadings = false,\n==an improperly formatted heading may be linked=\n" ]
// ],
];
}
}
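
The expected arrays in provideSplitData() have the shape that `preg_split()` produces when delimiters are captured: linkable text at even indices, protected text at odd indices, and a trailing empty string when the input ends with a delimiter. The Splitter's actual pattern is not part of this section, so the sketch below uses a deliberately simplified, hypothetical pattern that only protects existing `[[...]]` links.

```php
<?php
// Hypothetical, simplified stand-in for the Splitter: only existing wiki
// links are treated as delimiters that must not be linked again.
function splitOnLinks( string $text ): array {
    // PREG_SPLIT_DELIM_CAPTURE keeps the captured delimiter in the result.
    return preg_split( '/(\[\[.*?\]\])/', $text, -1, PREG_SPLIT_DELIM_CAPTURE );
}

$parts = splitOnLinks( 'this may be linked [[this may not be linked]]' );

// Only chunks at even indices would be passed on for automatic linking.
$linkable = [];
foreach ( $parts as $index => $chunk ) {
    if ( $index % 2 === 0 ) {
        $linkable[] = $chunk;
    }
}
```

Keeping the delimiters in the array lets the caller reassemble the page text with a plain `implode( '', $parts )` after processing the even-indexed chunks.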


@ -0,0 +1,61 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* @group bovender
*/
class TargetTest extends MediaWikiTestCase {
/**
* @dataProvider provideStartOnly
*/
public function testTargetWordStartOnly( $enabled, $delimiter ) {
$config = new LinkTitles\Config();
$config->wordStartOnly = $enabled;
$target = new LinkTitles\Target( NS_MAIN, 'test page', $config );
$this->assertSame( $delimiter, $target->wordStart );
}
public static function provideStartOnly() {
return [
[ true, '(?<!\pL)' ],
[ false, '' ]
];
}
/**
* @dataProvider provideEndOnly
*/
public function testTargetWordEndOnly( $enabled, $delimiter ) {
$config = new LinkTitles\Config();
$config->wordEndOnly = $enabled;
$target = new LinkTitles\Target( NS_MAIN, 'test page', $config );
$this->assertSame( $delimiter, $target->wordEnd );
}
public static function provideEndOnly() {
return [
[ true, '(?!\pL)' ],
[ false, '' ]
];
}
}
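
The two delimiters asserted above are lookarounds on the Unicode letter property, which Target::buildRegex() places around the quoted title. A minimal stand-alone sketch of that composition follows; the function name `buildTitleRegex` is hypothetical, only the pattern fragments are taken from the diff, and the examples are limited to ASCII input.

```php
<?php
// Hypothetical helper (not part of the extension): composes the same pattern
// fragments that LinkTitles\Target uses, without needing MediaWiki.
function buildTitleRegex( string $title, bool $wordStartOnly, bool $wordEndOnly ): string {
    // Lookbehind/lookahead on the "letter" property instead of \b.
    $wordStart = $wordStartOnly ? '(?<!\pL)' : '';
    $wordEnd = $wordEndOnly ? '(?!\pL)' : '';
    // The leading lookbehind rejects matches directly after URL-like punctuation.
    return '/(?<![\:\.\@\/\?\&])' . $wordStart . preg_quote( $title, '/' ) . $wordEnd . '/S';
}

$strict = buildTitleRegex( 'link target', true, true );
$loose = buildTitleRegex( 'link target', false, false );

$wholeWords = preg_match( $strict, 'a link target here' );   // title stands alone
$insideStrict = preg_match( $strict, 'hyperlink targets' );  // title embedded in longer words
$insideLoose = preg_match( $loose, 'hyperlink targets' );    // same text, no word checks
$afterSlash = preg_match( $strict, 'example.org/link target' ); // title follows a slash
```

With both options enabled, a title is matched only when it is not embedded in a longer word; with both disabled, partial-word matches are accepted as well.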


@ -0,0 +1,47 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
/**
* Tests the LinkTitles\Targets class.
*
* @group bovender
* @group Database
*/
class TargetsTest extends LinkTitles\TestCase {
/**
* Asserts that the number of potential link targets equals the number of
* pages in the wiki.
*/
public function testTargets() {
$title = \Title::newFromText( 'link target' );
$targets = LinkTitles\Targets::singleton( $title, new LinkTitles\Config() );
// Count number of articles: Inspired by updateArticleCount.php maintenance
// script: https://doc.wikimedia.org/mediawiki-core/master/php/updateArticleCount_8php_source.html
$dbr = wfGetDB( DB_SLAVE );
$counter = new SiteStatsInit( $dbr );
$count = $counter->pages();
// PHPUnit expects the expected value first, then the actual value.
$this->assertEquals( $count, $targets->queryResult->numRows() );
}
}

@ -0,0 +1,43 @@
<?php
/**
* Copyright 2012-2017 Daniel Kraus <bovender@bovender.de> ('bovender')
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
* MA 02110-1301, USA.
*
* @author Daniel Kraus <bovender@bovender.de>
*/
namespace LinkTitles;
abstract class TestCase extends \MediaWikiTestCase {
protected function setUp() {
parent::setUp();
}
protected function tearDown() {
parent::tearDown();
}
public function addDBDataOnce() {
parent::addDBDataOnce();
$this->insertPage( 'link target', 'This page serves as a link target' );
Targets::invalidate(); // force re-querying the pages table
}
protected function getPageText( \WikiPage $page ) {
$content = $page->getContent();
return $page->getContentHandler()->serializeContent( $content );
}
}