MediaWiki

From Bibliotheca Anonoma

The Bibliotheca Anonoma Wiki is configured specifically for our needs. It runs on an Nginx web server, with a PostgreSQL database on an SSD and images on a RAID array, and it has a load of interesting extensions.

  • Load Balancer DNS - Cloudflare
  • Web Server - Nginx
  • PHP Engine - PHP FPM
  • Database - PostgreSQL
  • Cache - memcached

To Do

  • Enable Anonymous Hash IDs like Tanasinn.info
  • Enable Cloudflare
  • Upgrade to Mediawiki 1.28
  • Enable PostgreSQL UNIX Socket on PHP-FPM for higher performance
  • Fix Logo for multiple sizes (displays weird on some computers, maybe make it match the Mediawiki default dimensions?)
  • Activate img.bibanon.org for finer tuned image management and serving
    • Prevent hotlinking on images from external sites to preserve bandwidth
  • Activate the best job queue system you can
  • Activate Memcached

Extensions

Infrastructure

  • Cloudflare - Using Cloudflare may require some changes to MediaWiki, or just to the web server, so that the actual client IPs are passed through. This is crucial if you allow anonymous IP edits (so you ban the user, not an entire Cloudflare server).
    • Apache mod_cloudflare - An Apache module that restores the real client IP from the forwarding headers Cloudflare adds, such as X-Forwarded-For.
    • Nginx Visitor IP Forward - You can use the X-Forwarded-For header to obtain the real IP address. Only needs the Nginx Real IP module, which is included in Debian.
  • Amazon AWS - The tools needed to support AWS S3 upload, if you are using it. If you use this you should probably bundle it with Amazon CloudFront, their content delivery network.
  • MobileFrontend - A mobile frontend just like the one on Wikipedia. Makes editing away from home much easier.
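
For the Nginx Visitor IP Forward item above, the Real IP module needs to know which proxy addresses to trust. The following is a minimal sketch that turns a list of CIDR ranges into Real IP directives. It assumes Cloudflare publishes its IPv4 ranges one per line at the URL below; the function name and the RUN_FETCH switch are made up for illustration.

```shell
#!/bin/sh
# Assumption: Cloudflare's IPv4 ranges are available here, one CIDR per line.
CF_RANGES_URL="${CF_RANGES_URL:-https://www.cloudflare.com/ips-v4}"

# Turn a list of CIDR ranges on stdin into Nginx Real IP directives.
to_real_ip_conf() {
    awk 'NF { print "set_real_ip_from " $1 ";" }
         END { print "real_ip_header X-Forwarded-For;" }'
}

# Only touch the network when explicitly asked: RUN_FETCH=1 sh cf-realip.sh
if [ "${RUN_FETCH:-0}" = 1 ]; then
    curl -fsS "$CF_RANGES_URL" | to_real_ip_conf
fi
```

The output could be saved to a file that Nginx includes from its http block, then checked with nginx -t before reloading.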

Mods

  • Anonymous IP Hash - Halcy developed a mod for MediaWiki on tanasinn that hashes the IPs of anonymous users, much like on 4chan's /b/ or 2channel.

Spam

  • SpamBlacklist - Comes with Mediawiki by default, and we've enabled it. However, it blocks a lot of good 4chan sources (naturally), so we've set up a whitelist as well.

Media

  • EmbedVideo - Embeds uploaded videos using the browser's own HTML5 <video> tag (requires MP4 or WebM). You can even embed from YouTube or NicoNico.

Security

  • OATHAuth - Uses TOTP one-time codes along with your password for two-factor authentication, in case one of them is compromised. You can run TOTP through Authy or Google Authenticator using any smartphone (or even a dumbphone if it has Java applets). Well maintained, since it is used by the Wikimedia Foundation for admin accounts. (Not to be confused with OAuth.)
    • Wikimedia Gerrit: 135618 - Wikimedia Phabricator - T67658 - In the stable releases, OATHAuth only supports MySQL at the moment. However, Reedy has added PostgreSQL tables, so you need to grab the latest version straight from the git repository.
    • Then, go to the page Special:Two-factor_authentication to activate TOTP. You can use an app such as Authy, Google Authenticator, Authomator (BB10), or any other TOTP app: perhaps even the hardware OnlyKey.
  • OAuth - You can use an OAuth system so that you can use your own wiki accounts as a single login system (rather than many), just like you would link Google or Facebook accounts with OAuth. In particular, Mediawiki has the ability to activate two factor authentication with the extension above. Requires Memcached.
    • This extension implements OAuth 1.0, which requires cryptography enabled on both ends. OAuth 2.0 doesn't require this, but it has tradeoffs as a result (though it can be overcome by restoring cryptographic plugins). Thus, it's not a question of which is better, but which would work for you. More details here.
    • While the extension currently has SQLite support, it doesn't have PostgreSQL support yet. But it's a simple matter of translating the syntax into the correct format, in this directory. Simple, if not easy. It might be possible to use the SQLite to PostgreSQL conversion script.

Widgets

Widgets are little bits of HTML which can be used as advanced templates.

  • SoundCloud - Allows us to embed SoundCloud music for playing.

Installation Instructions

The Bibliotheca Anonoma Wiki has a unique installation process. Generally, you follow the guides here, but mix them both together.

General

Follow this guide first, but then the distribution specific ones for further guidance and dependencies.

https://www.mediawiki.org/wiki/Manual:Installing_MediaWiki

Debian

For Debian, although there exists a Mediawiki package in jessie-backports, it installs Apache and MySQL, which is not what we use. But if you're fine with that, go ahead.

https://www.mediawiki.org/wiki/Manual:Running_MediaWiki_on_Debian_or_Ubuntu

Wiki Backup

Even the archivists must back themselves up periodically, especially on such a crucial wiki. But if we fall behind, you can also run the WikiTeam scripts to generate a full text and full image backup.

Text Backup

In the case of our wiki, database dumps are only done for internal use because they are specific to a certain version of MediaWiki and to our unique extensions, and they contain sensitive data such as password hashes. They may not even be helpful to our successors, since we use PostgreSQL, and MySQL/MariaDB may be easier to set up with MediaWiki.

Instead, we provide XML dumps which are version independent and free to all, and are periodically uploaded to the Internet Archive. These can also be made by the general public via Special:Export, which is what the WikiTeam scripts do.

Use dumpBackup.php (in MediaWiki's maintenance directory) to create XML dumps on the server itself. Then 7zip them up.
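
The two steps above can be sketched as a small script. This is only an outline under stated assumptions: WIKI_DIR and OUT_DIR are placeholder paths, the file naming scheme and the RUN_DUMP switch are made up for illustration, and the php and 7z commands must be installed.

```shell
#!/bin/sh
# Assumed locations; adjust to your own install.
WIKI_DIR="${WIKI_DIR:-/var/www/mediawiki}"
OUT_DIR="${OUT_DIR:-/tmp}"

# Build a dated file name such as bibanon-20170101-history.xml
# (hypothetical naming scheme).
dump_name() {
    printf '%s-%s-history.xml\n' "$1" "$2"
}

# Dump every revision of every page to XML, then compress the result.
make_xml_dump() {
    name=$(dump_name "$1" "$(date +%Y%m%d)")
    php "$WIKI_DIR/maintenance/dumpBackup.php" --full --quiet > "$OUT_DIR/$name" &&
        7z a "$OUT_DIR/$name.7z" "$OUT_DIR/$name"
}

# Run only when asked, e.g.: RUN_DUMP=1 sh xml-dump.sh
if [ "${RUN_DUMP:-0}" = 1 ]; then
    make_xml_dump bibanon
fi
```

Passing --current instead of --full would dump only the latest revision of each page, which gives a much smaller file but loses the edit history.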

These XML dumps can then be imported through these procedures.

Image Backup

Image backup can easily be done from our end, so we commit to doing it so that you don't have to.

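Note that importImages.php imports files into a wiki rather than exporting them. To dump the images, copy the image directory to a folder (dumpUploads.php can list the uploaded files). Then 7zip them up into the WikiTeam format along with the XML.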

PostgreSQL

PostgreSQL is used as our database for a number of reasons, from stability to compatibility with other apps to support for JSONB values.

However, it is clearly not the most popular choice of database for Mediawiki, so we do make some workarounds to support this unique use case. These mods are noted below.

PostgreSQL with UNIX Sockets

As noted in the MediaWiki tutorial, you generally connect to PostgreSQL over a TCP connection and use md5 password authentication.

However, if the PostgreSQL database is on the same server, it's a better idea to dispense with the TCP overhead and connect to the UNIX socket directly.

First, make sure that you've set a password for the postgres superuser, so that you can still log in once the socket is switched from peer authentication to md5.

$ sudo su # must become root to become postgres user first
# su postgres
$ psql
postgres=# \password
Enter new password:
Enter it again:
postgres=#

Next, we need to enable md5 authentication to the UNIX socket. On PostgreSQL 9.6 on Debian, edit the file /etc/postgresql/9.6/main/pg_hba.conf and change the following lines to match the below:

# "local" is for Unix domain socket connections only
local   all             all                                     md5
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
# IPv6 local connections:
host    all             all             ::1/128                 md5

On Debian, the PostgreSQL UNIX socket is at /var/run/postgresql/.s.PGSQL.5432. PostgreSQL's client library expects the directory that contains the socket file rather than the file itself, so in LocalSettings.php set the following lines (make sure to comment out $wgDBport, which is not needed):

## Database settings
$wgDBtype = "postgres";
$wgDBserver = "/var/run/postgresql"; # directory containing the UNIX socket

# Postgres specific settings
#$wgDBport = "5432"; # disable this

Note: If the PostgreSQL database is on another server, you have to use TCP rather than UNIX sockets to communicate with it, so expose the port on that server and put its IP in $wgDBserver in LocalSettings.php.
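
Before pointing MediaWiki at the socket, it may help to confirm that password login over it works. The sketch below assumes PostgreSQL has been reloaded after the pg_hba.conf edit, and uses wikidb and wikiuser as placeholder database and role names. Clients built on PostgreSQL's client library, such as psql, take the directory holding the socket file as the host.

```shell
#!/bin/sh
# Directory holding the socket file (.s.PGSQL.5432) on Debian.
PG_SOCKET_DIR="${PG_SOCKET_DIR:-/var/run/postgresql}"

# Print a connection string that selects the UNIX socket instead of TCP.
socket_conninfo() {
    printf 'host=%s dbname=%s user=%s\n' "$PG_SOCKET_DIR" "$1" "$2"
}

# Try a trivial query; with md5 in effect, psql prompts for the password.
check_socket_login() {
    psql "$(socket_conninfo "$1" "$2")" -c 'SELECT 1;'
}

# Run only when asked, e.g.: RUN_CHECK=1 sh pg-socket-check.sh
if [ "${RUN_CHECK:-0}" = 1 ]; then
    check_socket_login wikidb wikiuser   # hypothetical names
fi
```

If the query succeeds after a password prompt, md5 authentication on the socket is active.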

Activating Memcached

Memcached is an alternative to the default APCu PHP caching system, and is designed to significantly lighten the load of queries on the database. Also, the OAuth extension requires memcached.

https://www.mediawiki.org/wiki/Memcached#Setup

https://www.howtoforge.com/install-memcached-and-php5-memcached-module-on-debian-6.0-squeeze
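
Once memcached is installed, a quick way to see whether the wiki is actually using it is to watch the daemon's hit counter. This is a rough sketch that assumes memcached listens on its usual 127.0.0.1:11211 and that nc (netcat) is available. The function names and the RUN_CHECK switch are made up for illustration.

```shell
#!/bin/sh
MEMCACHED_HOST="${MEMCACHED_HOST:-127.0.0.1}"
MEMCACHED_PORT="${MEMCACHED_PORT:-11211}"

# Read memcached "stats" output on stdin and print the value of one counter.
stat_value() {
    tr -d '\r' | awk -v key="$1" '$1 == "STAT" && $2 == key { print $3 }'
}

# Ask the running daemon for its statistics over the plain-text protocol.
memcached_stats() {
    printf 'stats\r\nquit\r\n' | nc "$MEMCACHED_HOST" "$MEMCACHED_PORT"
}

# Run only when asked, e.g.: RUN_CHECK=1 sh memcached-check.sh
if [ "${RUN_CHECK:-0}" = 1 ]; then
    memcached_stats | stat_value get_hits
fi
```

If get_hits keeps growing while you browse the wiki, MediaWiki is reading from the cache.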

Serving files using a specific images subdomain

Nginx can be optimized to make image serving more efficient and to block hotlinking. Since the settings for static images often differ greatly from those for dynamic text, it is recommended that you create a specific subdomain just for images (such as img.bibanon.org) and serve your image folder from there.

Here is the Nginx config we used (without SSL), with our image folder in a custom directory, /storage/mw-img/:

server {
    listen 80;
    server_name img.bibanon.org;

    # images stored here
    root /storage/mw-img/;

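    # let's encrypt SSL dir
    # (prefix match with ^~, because the "^~ /" block below would otherwise
    # stop Nginx from ever checking a regex location)
    location ^~ /.well-known/ {
        root /var/lib/letsencrypt;
    }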

    location ^~ / {
        try_files $uri =404;
    }

    location ^~ /thumb/ {
        try_files $uri =404;     
    }

    # block unnecessary access
    location ^~ /lockdir/ { deny all; }
    location ^~ /temp/ { deny all; }
    location ^~ /archive/ { deny all; }

    # block image hotlinking, but not from search engines
    valid_referers none blocked bibanon.org *.bibanon.org ~.google. ~.bing. ~.yahoo.;
    if ($invalid_referer) {
        return   403; # you can alternatively serve a small unsavory picture to be a douche, though it still takes a little bandwidth
    }
}
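
To check that the hotlink rule behaves as intended, you can request an image with different Referer headers and compare the status codes. The sketch below assumes curl is installed and uses a placeholder image URL. The function names and the RUN_CHECK switch are made up for illustration.

```shell
#!/bin/sh
# Hypothetical image URL; substitute a file that really exists on the wiki.
IMG_URL="${IMG_URL:-http://img.bibanon.org/example.png}"

# Print the HTTP status code returned when sending the given Referer.
status_with_referer() {
    curl -s -o /dev/null -w '%{http_code}' -e "$1" "$IMG_URL"
}

# Compare an observed status code with the expected one.
report() {
    if [ "$2" = "$3" ]; then
        echo "ok: $1 -> $2"
    else
        echo "FAIL: $1 -> $2 (expected $3)"
    fi
}

# Run only when asked, e.g.: RUN_CHECK=1 sh hotlink-check.sh
if [ "${RUN_CHECK:-0}" = 1 ]; then
    own="http://bibanon.org/wiki/Main_Page"
    other="http://example.com/"
    report "$own" "$(status_with_referer "$own")" 200
    report "$other" "$(status_with_referer "$other")" 403
fi
```

A request that claims to come from the wiki itself should be served, while one that claims to come from an unrelated site should get the 403 from the config above.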