MediaWiki: Difference between revisions

From Bibliotheca Anonoma

Revision as of 04:39, 15 January 2017

The Bibliotheca Anonoma Wiki has an unusual configuration tailored to our needs. It is hosted on an Nginx web server, with the PostgreSQL database on an SSD and images on a RAID array, and it runs a number of interesting extensions.

  • Load Balancer DNS - Cloudflare
  • Caching Front Server - Varnish
  • Web Server - Nginx
  • PHP Engine - HHVM
  • Database - MariaDB
  • Lua Engine - LuaSandbox C module
  • Cache - memcached

Network Topology of the Bibliotheca Anonoma Wiki

Ordered in layers from front to back. Details on our implementation and instructions on how to replicate them are found in pages below.

  • Cloudflare - Cloudflare is a CDN, load balancer, and DDoS mitigation service: all for free.
    • Using Cloudflare may require some special mods to MediaWiki, or just to the web server, to pass the actual client IPs through. This is crucial if you allow anonymous IP edits, so that you ban the user and not an entire Cloudflare server.
    • Apache mod_cloudflare - An extension used for transmitting the real IP using X-Forwarded-For.
    • Nginx Visitor IP Forward - You can use the X-Forwarded-For header to obtain the real IP address. This only needs the Nginx Real IP module, which is included in Debian.
  • Nginx (SSL Redirect and img.bibanon.org) - Nginx is used primarily as an SSL frontend to Varnish, since Varnish doesn't support SSL. It also serves static files from img.bibanon.org directly, without going through Varnish. We didn't bother caching img.bibanon.org with Varnish: for static files, Nginx and Varnish perform similarly, so there is no need to add another layer.
  • Varnish - This caching front server is used on Wikimedia sites to significantly reduce how often dynamic pages need to be regenerated, while preventing outdated caches by having MediaWiki tell Varnish directly what needs to be regenerated.
  • Nginx - Nginx also serves as the backend to Varnish, listening on 127.0.0.1:8080 (internal only), and passes PHP requests to a PHP-FPM UNIX socket.
    • HHVM - Facebook's HipHop Virtual Machine significantly speeds up PHP code with just-in-time compilation. It's also what the Wikimedia Foundation uses.
    • PHP-FPM - Unlike Apache, Nginx can't run PHP itself, so we use PHP-FPM here. It's a bit faster than normal PHP.
  • MediaWiki/Installation - How we install MediaWiki itself.
  • PostgreSQL - Used as our database for a number of reasons, from stability to compatibility with other apps to support for JSONB values. However, it is clearly not the most popular choice of database for MediaWiki, so we had to make some workarounds to support it. Unfortunately, MediaWiki was made for MySQL/MariaDB first, so we decided to move over to MariaDB.
  • Memcached - An alternative to the default APCu PHP caching system, designed to significantly lighten the query load on the database. Also, the OAuth extension requires Memcached.
  • Redis - The all-important job queue is currently run with a cron job, but there are more effective ways of doing this, such as a Redis-backed queue.
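
The client IP point under Cloudflare above is easy to get wrong, so here is a minimal Python sketch of the logic a Real IP module applies. It is only an illustration, not our actual Nginx configuration: the TRUSTED_PROXIES ranges and the function names are hypothetical, and a real setup would list Cloudflare's published ranges plus its own Varnish and Nginx hosts. The idea is to walk X-Forwarded-For from right to left and stop at the first address that is not one of your own proxies, since anything further left can be forged by the client.

```python
import ipaddress

# Hypothetical trusted proxy ranges, for illustration only.
TRUSTED_PROXIES = [
    ipaddress.ip_network("127.0.0.0/8"),      # local Varnish / Nginx hops
    ipaddress.ip_network("203.0.113.0/24"),   # documentation range standing in for the CDN
]


def is_trusted(addr):
    """True if addr belongs to one of the trusted proxy ranges."""
    ip = ipaddress.ip_address(addr)  # raises ValueError on malformed input
    return any(ip in net for net in TRUSTED_PROXIES)


def real_client_ip(remote_addr, x_forwarded_for):
    """Return the first untrusted address, walking the chain right to left."""
    if not is_trusted(remote_addr):
        # Direct connection from an unknown host: ignore the header entirely.
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop trusting the rest of the header
    return remote_addr


print(real_client_ip("127.0.0.1", "198.51.100.7, 203.0.113.5"))  # 198.51.100.7
```

With this approach a ban lands on the visitor's address rather than on a shared proxy address.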

Extensions

Infrastructure

  • Amazon AWS - The tools needed to support AWS S3 upload, if you are using it. If you use this, you should probably pair it with Amazon CloudFront, their content delivery network.
  • MobileFrontend - A mobile frontend just like the one on Wikipedia. Makes editing away from home much easier.
  • Translate - Very powerful translation tool used on most Wikimedia wikis to great effect.
  • Cargo - Adds semantic metadata handling to MediaWiki, making it a very powerful semantic web database. Cargo also works as a simpler, better alternative to Semantic MediaWiki, because in practice metadata is stored only in infoboxes anyway.

Lua Modules

Lua modules are a powerful and efficient alternative to the increasingly incomprehensible MediaWiki templating language. If a template is going to be programmed anyway, we might as well use a real programming language.

  • Scribunto - Provides Lua scripting for Turing-complete computation instead of using increasingly complex template scripting. Might be a little intimidating to install, but it's well worth it.
  • Capiunto - Easy and effective infoboxes for anyone.

Mods

  • Anonymous IP Hash - Halcy developed a mod for MediaWiki on tanasinn that hashes the IPs of anonymous users, much like on 4chan's /b/ or 2channel.
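
To give a rough idea of what hashing an IP means here, the following Python sketch derives a short, stable pseudonym from an address using a keyed hash. This is not Halcy's code and we make no claim that the mod works this way: the function name, SECRET_SALT, the choice of HMAC-SHA256, and the 8-character length are all assumptions for illustration. A secret salt matters because the IPv4 space is small enough to brute-force an unsalted hash.

```python
import base64
import hashlib
import hmac

# Hypothetical secret; a real wiki would keep this out of version control.
SECRET_SALT = b"change-me"


def anon_id(ip, salt=SECRET_SALT, length=8):
    """Derive a short, stable pseudonym from an IP address string."""
    digest = hmac.new(salt, ip.encode("ascii"), hashlib.sha256).digest()
    # Base32 keeps the ID to letters and digits, then we truncate it.
    return base64.b32encode(digest).decode("ascii")[:length]


# The same address always maps to the same ID under a given salt.
print(anon_id("198.51.100.7") == anon_id("198.51.100.7"))  # True
```

Rotating the salt periodically would make IDs unlinkable across periods, at the cost of losing continuity for long-term anonymous editors.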

Spam

  • SpamBlacklist - Comes with MediaWiki by default, and we've enabled it. However, it blocks a lot of good 4chan sources (naturally), so we've set up a whitelist as well.

Media

  • EImage - Embed external images as if they were normal MediaWiki images.
  • EmbedVideo - This embeds uploaded videos using the browser's own HTML5 <video> tag (requires MP4 or WebM). You can even embed from YouTube or NicoNico.

Security

  • OATHAuth - Uses TOTP one-time codes along with your password for two-factor authentication, in case one of them is compromised. You can run TOTP through Authy or Google Authenticator on any smartphone (or even a dumbphone, if it has Java applets). It is well maintained, since the Wikimedia Foundation uses it for admin accounts. Not to be confused with OAuth.
    • Wikimedia Gerrit: 135618 - Wikimedia Phabricator - T67658 - In the stable releases, OATHAuth only supports MySQL at the moment. However, Reedy has added PostgreSQL tables, so you need to grab the latest version straight from the git repository.
    • Then, go to the page Special:Two-factor_authentication to activate TOTP. You can use an app such as Authy, Google Authenticator, Authomator (BB10), or any other TOTP app, perhaps even the hardware OnlyKey.
  • OAuth - With OAuth, your wiki accounts can act as a single login system for other services (rather than keeping many separate logins), just like you would link Google or Facebook accounts with OAuth. In particular, MediaWiki can add two-factor authentication to those logins with the extension above. Requires Memcached.
    • This extension implements OAuth 1.0, which requires cryptography enabled on both ends. OAuth 2.0 doesn't require this, but it has tradeoffs as a result (though they can be overcome by restoring cryptographic plugins). Thus, it's not a question of which is better, but which would work for you. More details here.
    • While the extension currently has SQLite support, it doesn't have PostgreSQL support yet. Adding it is a matter of translating the SQL syntax into the correct format, in this directory. Simple, if not easy. It might be possible to use the SQLite to PostgreSQL conversion script.
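
For the curious, the TOTP codes mentioned under OATHAuth can be reproduced in a few lines of Python. This is a minimal sketch of the standard algorithm (RFC 6238, built on RFC 4226 HOTP) using only the standard library, not the extension's own code. The function names and defaults (30-second step, 6 digits, SHA-1) are our choices for illustration, though they match what most authenticator apps expect.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(key, counter, digits=6):
    """RFC 4226 HOTP value for a raw key and an integer counter."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32, now=None, step=30, digits=6):
    """RFC 6238 TOTP: HOTP with the counter derived from the clock."""
    key = base64.b32decode(secret_b32.upper())
    if now is None:
        now = time.time()
    return hotp(key, int(now) // step, digits)


# RFC test key "12345678901234567890", encoded the way apps expect it.
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
print(totp(secret, now=59))  # 287082
```

Since the server and the phone only share the secret and a clock, a code stays valid for one time step, which is why clock drift on either side is the usual cause of rejected codes.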

Widgets

Widgets are little bits of HTML which can be used as advanced templates.

  • SoundCloud - Allows us to embed SoundCloud music for playing.

Wiki Backup

Even the archivists must back themselves up periodically, especially on such a crucial wiki. But if we fall behind, you can also run the WikiTeam scripts to generate a full text and full image backup.

Text Backup

In the case of our wiki, database dumps are only made for internal use, because they are specific to a certain version of MediaWiki and to our unique extensions, and they contain sensitive data such as password hashes. They may not even be helpful to our successors, since we use PostgreSQL, and MySQL/MariaDB may be easier to set up with MediaWiki.

Instead, we provide XML dumps which are version independent and free to all, and are periodically uploaded to the Internet Archive. These can also be made by the general public via Special:Export, which is what the WikiTeam scripts do.

Use the dumpBackup.php maintenance script to create XML dumps on the server itself. Then 7zip them up.

These XML dumps can then be imported through these procedures.
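
Because the XML dumps are version independent, they are also easy to read without MediaWiki at all. As a minimal sketch, the Python below walks an export file and yields each page title with the text of the last revision listed for it. The function names are ours, and the sample element layout (page, title, revision, text) is the general shape of the export format. The code strips the XML namespace rather than hard-coding an export schema version.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, text of last listed revision) for each <page> element."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text or ""  # later revisions overwrite earlier ones
        yield title, text
        elem.clear()  # free memory when streaming large dumps


# Tiny made-up sample standing in for a real dump file.
sample = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<page><title>Main Page</title><revision><text>Hello</text></revision></page>
</mediawiki>"""
for title, text in iter_pages(io.BytesIO(sample)):
    print(title, len(text))  # Main Page 5
```

The same function accepts a path to a decompressed dump file in place of the in-memory sample.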

Image Backup

Image backup can easily be done from our end, so we commit to doing it so that you don't have to.

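Use the dumpUploads.php maintenance script to list the uploaded files and copy them to a folder (importImages.php does the reverse, loading images into a wiki). Then 7zip them up into the WikiTeam format along with the XML.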