Editing FoolFuuka/Asagi

From Bibliotheca Anonoma


Latest revision Your text
Line 3: Line 3:
<blockquote>Note: Not to be confused with the FoolFuuka frontend, which uses the Asagi scraper.
<blockquote>Note: Not to be confused with the FoolFuuka frontend, which uses the Asagi scraper.
</blockquote>
</blockquote>
https://github.com/eksopl/fuuka/wiki/Sphinx-Search-Backend#gory-details
https://github.com/eksopl/fuuka/wiki/Sphinx-Search-Backend#gory-details


Line 9: Line 10:
The Asagi Imageboard Archival Standard was developed by eksopl of Easymodo and the Foolz team under the direction of woxxy. It was developed to run the Foolz archiver, and has been the engine for the majority of archivers since the collapse of Archive.moe.
The Asagi Imageboard Archival Standard was developed by eksopl of Easymodo and the Foolz team under the direction of woxxy. It was developed to run the Foolz archiver, and has been the engine for the majority of archivers since the collapse of Archive.moe.


== SQL Schema ==
Two versions of the SQL Schema in use can be identified:

* 1.0.0 (2013) - The final version by Eksopl for Foolz.us; Archive.moe used the same SQL schema unchanged.
** Presumably used by Fireden, but this is unconfirmed.
** Might still be used by Nyafuu; was in use by Loveisover.
* 1.3.0 (2019) - The final reference standard, consolidated from the 4plebs repos by the Bibliotheca Anonoma for use in Desuarchive, in preparation for the development of new drop-in replacements. Only Mysql/triggers.sql was changed; there are no other structural SQL schema changes.
** Used by Desuarchive, Rbt, arch.b4k.co, maybe 4plebs?

A refined version 2.0.0 is proposed that would eliminate SQL triggers for improved performance, leaving it up to the scraper engines to perform the equivalent operations.

Three versions can be identified:

* Mark I (2009) - Produced for Foolz.us. Maybe in use by Nyafuu, was in use by Loveisover.
* Mark II (2015) - Produced for Archive.moe. Used by Fireden and arch.b4k.co.
* Mark III (2019) - The final reference standard codified by the Bibliotheca Anonoma, in preparation for the development of new drop-in replacements.


== Reference Implementation ==
== Reference Implementation ==
Line 27: Line 22:
** Although the Asagi scraper supports PostgreSQL, FoolFuuka does not.
** Although the Asagi scraper supports PostgreSQL, FoolFuuka does not.
* Scraper: Asagi (Java) - https://github.com/bibanon/asagi
* Scraper: Asagi (Java) - https://github.com/bibanon/asagi
* Frontend: FoolFuuka (PHP) - https://github.com/bibanon/FoolFuuka
* Frontend: FoolFuuka (PHP) - https://github.com/pleebe/FoolFuuka/tree/experimental
* PHP Engine: Historically HHVM, PHP 5.x compatible. Desuarchive and 4plebs now use PHP 7.
* PHP Engine: Historically HHVM, PHP 5.x compatible. Desuarchive and 4plebs now use PHP 7.
* Search: Sphinxsearch
* Search: Sphinxsearch
Line 46: Line 41:
https://github.com/eksopl/asagi/wiki/Running-Asagi
https://github.com/eksopl/asagi/wiki/Running-Asagi


Also check [[FoolFuuka/Install/Ubuntu16#Install_and_compile_Asagi_from_source.|FoolFuuka/Install/Ubuntu16#Install_and_compile_Asagi_from_source.]]
Also check [[FoolFuuka/Install/Ubuntu16#Install_and_compile_Asagi_from_source.]]


=== FoolFuuka ===
=== FoolFuuka ===
Line 52: Line 47:
https://blog.foolz.us/
https://blog.foolz.us/


Also check [[FoolFuuka/Install/Ubuntu16|FoolFuuka/Install/Ubuntu16]]
Also check [[FoolFuuka/Install/Ubuntu16]]
 
= How Asagi does stuff =
 
== Configuration ==
 
For example, here is the configuration for Desuarchive:
 
<pre>{&quot;settings&quot;: {
  &quot;dumperEngine&quot;: &quot;DumperJSON&quot;,
  &quot;sourceEngine&quot;: &quot;YotsubaJSON&quot;,


  &quot;boardSettings&quot;: {
    &quot;default&quot;: {
      &quot;engine&quot;: &quot;Mysql&quot;,
      &quot;database&quot;: &quot;asagi&quot;,
      &quot;host&quot;: &quot;localhost&quot;,
      &quot;username&quot;: &quot;asagi&quot;,
      &quot;password&quot;: &quot;YOUR_PASSWORD_HERE&quot;,
      &quot;charset&quot;: &quot;utf8mb4&quot;,
      &quot;path&quot;: &quot;/srv/foolfuuka/boards&quot;,
      &quot;updateFileLastModified&quot;: false,
      &quot;useOldDirectoryStructure&quot;: false,
      &quot;webserverGroup&quot;: &quot;www-data&quot;,
      &quot;thumbThreads&quot;: 2,
      &quot;mediaThreads&quot;: 2,
      &quot;newThreadsThreads&quot;: 6,
      &quot;deletedThreadsThresholdPage&quot;: 8,
      &quot;refreshDelay&quot;: 60,
      &quot;throttleAPI&quot;: false,
      &quot;throttleURL&quot;: &quot;i.4cdn.org&quot;,
      &quot;throttleMillisec&quot;: 1050,
      &quot;threadRefreshRate&quot;: 50
    },


    &quot;mlp&quot;: {},
    &quot;qa&quot;: {},
    &quot;aco&quot;: {},
    &quot;tg&quot;: {},
    &quot;d&quot;: {},
    &quot;co&quot;: {},
    &quot;a&quot;: {},
    &quot;an&quot;: {},
    &quot;k&quot;: {},
    &quot;fit&quot;: {},
    &quot;wsg&quot;: {&quot;mediaThreads&quot;: 0},
    &quot;gif&quot;: {&quot;mediaThreads&quot;: 0},
    &quot;r9k&quot;: {},
    &quot;int&quot;: {},
    &quot;c&quot;: {},
    &quot;m&quot;: {},
    &quot;vr&quot;: {},
    &quot;his&quot;: {},
    &quot;trash&quot;: {},
    &quot;cgl&quot;: {},
    &quot;g&quot;: {},
    &quot;mu&quot;: {}
  }
}}</pre>
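The layout of this file suggests that each board entry only lists the keys that differ from <code>default</code> (for example, <code>wsg</code> sets <code>mediaThreads</code> to 0). A minimal sketch of that overlay, assuming board keys simply replace default keys; <code>BoardSettingsSketch</code> and <code>effectiveSettings</code> are hypothetical names for illustration, not Asagi's own classes:

```java
import java.util.HashMap;
import java.util.Map;

class BoardSettingsSketch {
    // Hypothetical helper: overlay one board's entry on the "default" block.
    // Assumption: a key present in the board entry wins over the default.
    static Map<String, Object> effectiveSettings(Map<String, Object> defaults,
                                                 Map<String, Object> boardOverrides) {
        Map<String, Object> merged = new HashMap<>(defaults);
        merged.putAll(boardOverrides);
        return merged;
    }
}
```

Under this assumption, merging the defaults with the <code>wsg</code> entry yields <code>mediaThreads</code> = 0, while a board with an empty entry such as <code>a</code> keeps the default of 2.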
== Misc ==
== Misc ==


Line 117: Line 57:
AbstractDumper.java ln. 95: <code>public void initDumper(BoardSettings boardSettings) {</code>
AbstractDumper.java ln. 95: <code>public void initDumper(BoardSettings boardSettings) {</code>


== How Asagi decides to update a thread: ==
== How asagi decides to update a thread: ==


In <code>initDumper()</code>, <code>DumperJSON</code> spawns an instance of its inner class <code>BoardPoller</code> on a thread. In its <code>run()</code> method, <code>BoardPoller</code> loops indefinitely:

* Wake up from sleeping (duration set by <code>refreshDelay</code> in the configuration)
* <code>threadList = sourceBoard.getAllThreads(lastMod);</code>
* If the request 304s or errors, go back to sleep
* Go over the previous threads:
** If this thread is in the current threads and it’s been modified, record its modification timestamp and page number. Then, push it to <code>newTopics</code>.
** If this thread is not in the current threads, it’s been deleted. Push it to <code>newTopics</code>.
* Put the remaining current threads in <code>newTopics</code>
* Sleep until the delay expires

The queue <code>newTopics</code> is processed by <code>AbstractDumper</code>’s inner class <code>TopicFetcher</code>.
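
The comparison step of the polling loop can be sketched roughly as follows. This is an illustration only, not the code in <code>DumperJSON</code>: it assumes each thread list is a map from thread number to last-modified timestamp, and <code>ThreadListDiff</code> and <code>threadsToRefresh</code> are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ThreadListDiff {
    // Hypothetical: both maps go from thread number to last-modified timestamp.
    static List<Integer> threadsToRefresh(Map<Integer, Long> previous,
                                          Map<Integer, Long> current) {
        List<Integer> newTopics = new ArrayList<>();
        for (Map.Entry<Integer, Long> old : previous.entrySet()) {
            Long nowModified = current.get(old.getKey());
            if (nowModified == null) {
                // No longer listed: treated as deleted, queued so it can be checked.
                newTopics.add(old.getKey());
            } else if (nowModified > old.getValue()) {
                // Still listed and modified since the last poll.
                newTopics.add(old.getKey());
            }
        }
        for (Integer num : current.keySet()) {
            if (!previous.containsKey(num)) {
                newTopics.add(num); // remaining threads: not seen in the previous poll
            }
        }
        return newTopics;
    }
}
```

Unchanged threads are skipped, so only modified, vanished, or newly seen threads reach the queue in this sketch.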
 
== When Asagi does a thread update: ==


In YotsubaJSON.java, ln. 88: <code>public Topic getThread(int threadNum, String lastMod) throws ContentGetException, ContentParseException, CfBicClearParseException {</code>
In YotsubaJSON.java, ln. 88: <code>public Topic getThread(int threadNum, String lastMod) throws ContentGetException, ContentParseException, CfBicClearParseException {</code>
Line 138: Line 67:
* Loads the thread JSON.
* Decodes the JSON.
* For each post in the decoded thread JSON:
** If the <code>resto</code> value is zero, create a new thread from that post, setting its last-modified time to the time the JSON was fetched. <code>t = this.makeThreadFromJson(pj);</code>
** If <code>resto</code> is not zero, add the post to the current thread. <code>t.addPost(this.makePostFromJson(pj));</code>

(What if two posts had resto==0? We’d break!)
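
A minimal sketch of this loop, using stand-in types rather than Asagi's <code>Topic</code> and <code>PostJson</code>. The explicit guards against a second OP or a reply arriving before the OP are an assumption made here to make the failure mode visible; whether <code>getThread</code> handles these cases the same way is not established above.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadAssemblySketch {
    // Stand-in for a decoded post: its number and the thread it replies to.
    static final class PostStub {
        final int no;
        final int resto;
        PostStub(int no, int resto) { this.no = no; this.resto = resto; }
    }

    // Stand-in for a thread: the OP number plus the reply numbers in order.
    static final class TopicStub {
        final int opNo;
        final List<Integer> replies = new ArrayList<>();
        TopicStub(int opNo) { this.opNo = opNo; }
    }

    static TopicStub assemble(List<PostStub> posts) {
        TopicStub t = null;
        for (PostStub p : posts) {
            if (p.resto == 0) {
                // resto == 0 marks the opening post.
                if (t != null) throw new IllegalStateException("two OP posts in one thread");
                t = new TopicStub(p.no);
            } else {
                if (t == null) throw new IllegalStateException("reply seen before the OP");
                t.replies.add(p.no);
            }
        }
        return t;
    }
}
```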


== What does Asagi do with a post in a thread? ==
== What does asagi do with a post in a thread? ==


Relevant files: &gt;YotsubaJSON.java &gt;YotsubaAbstract.java
Relevant files: &gt;YotsubaJSON.java &gt;YotsubaAbstract.java
Line 188: Line 117:
Sanitized? EXIF data YotsubaJSON.java (ln. 212): <code>p.setExif(this.cleanSimple(this.parseMeta(pj.getCom(), pj.getUniqueIps(), pj.getSince4pass(), pj.getTrollCountry())));</code> Return the Post() object.
Sanitized? EXIF data YotsubaJSON.java (ln. 212): <code>p.setExif(this.cleanSimple(this.parseMeta(pj.getCom(), pj.getUniqueIps(), pj.getSince4pass(), pj.getTrollCountry())));</code> Return the Post() object.


== How Asagi handles an image in a thread? ==
== How asagi handles an image in a thread? ==


Files of note: &gt;Local.java - Saving image files
Files of note: &gt;Local.java - Saving image files
Line 194: Line 123:
Local.java ln.201: <code>public void insertMedia(MediaPost h, Board source, boolean isPreview) throws ContentGetException, ContentStoreException, CfBicClearParseException {</code>
Local.java ln.201: <code>public void insertMedia(MediaPost h, Board source, boolean isPreview) throws ContentGetException, ContentStoreException, CfBicClearParseException {</code>


== How Asagi deals with post deletions? ==
== How asagi deals with post deletions? ==


Relevant files: &gt;src.java &gt;src.java &gt;src.java
Relevant files: &gt;src.java &gt;src.java &gt;src.java
Line 263: Line 192:
Post ID number from 4chan. Passed through as-is.
Post ID number from 4chan. Passed through as-is.


==== “subnum” ====
==== “subnum“ ====


Ghost post ID number for FoolFuuka. Always zero?
Ghost post ID number for FoolFuuka. Always zero?
Line 321: Line 250:
==== N/A -&gt; “Deleted” ====
==== N/A -&gt; “Deleted” ====


Set initially to false, then updated if the post is absent from the thread during a subsequent update. (YotsubaJSON.java, ln. 205): <code>p.setdeleted(false);</code>
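
The "absent during a subsequent update" check amounts to a set difference between the post numbers seen earlier and those in the latest fetch. A small sketch of that idea, with hypothetical names; how Asagi actually tracks and flags these posts is not shown here:

```java
import java.util.Set;
import java.util.TreeSet;

class DeletionSketch {
    // Returns the post numbers that were seen before but are missing now.
    static Set<Integer> missingPosts(Set<Integer> previouslySeen, Set<Integer> currentFetch) {
        Set<Integer> missing = new TreeSet<>(previouslySeen);
        missing.removeAll(currentFetch);
        return missing;
    }
}
```

Posts returned by <code>missingPosts</code> would be the candidates for having their deleted flag set.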
 
==== “Capcode” -&gt; “Capcode” ====

(YotsubaJSON.java, ln. 173):

<pre>String capcode = pj.getCapcode();
if (capcode != null) {
    if (capcode.equals(&quot;manager&quot;) || capcode.equals(&quot;Manager&quot;)) {
        capcode = &quot;G&quot;;
    } else {
        capcode = capcode.substring(0, 1).toUpperCase();
    }
}</pre>
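
To see what this mapping produces, the same logic can be wrapped in a standalone method. <code>CapcodeSketch</code> is a hypothetical wrapper for illustration; only the branch logic comes from the excerpt above.

```java
class CapcodeSketch {
    // Same branches as the excerpt: "manager" maps to "G", anything else
    // maps to its upper-cased first letter. A null capcode is passed through,
    // which is an assumption about what happens outside the excerpt.
    static String normalize(String capcode) {
        if (capcode == null) {
            return null;
        }
        if (capcode.equals("manager") || capcode.equals("Manager")) {
            return "G";
        }
        // Like the excerpt, this would throw on an empty string.
        return capcode.substring(0, 1).toUpperCase();
    }
}
```

For example, <code>normalize("mod")</code> gives <code>"M"</code> and <code>normalize("admin")</code> gives <code>"A"</code>. The special case exists because "manager" would otherwise collapse to the same letter as "mod".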
==== “email” -&gt; “Email” ====
==== “email” -&gt; “Email” ====


Line 385: Line 307:
== Images table values: ==
== Images table values: ==


These seem to be handled by triggers that run on post insert.

=== 4ch -&gt; (Asagi) -&gt; DB ===


=== N/A -&gt; (Incremental integer) -&gt; “media_id” ===
=== N/A -&gt; (Incremental integer) -&gt; “media_id” ===
Line 399: Line 319:
=== N/A -&gt; (Local filepath to full image) -&gt; “media” ===
=== N/A -&gt; (Local filepath to full image) -&gt; “media” ===


The relative path to the image on disk. (Triggers.sql ln.119-139): <code>TODO</code>
The relative path to the image on disk. (Triggers.sql ln.119-139): <code>todo-codeblock</code>


=== N/A -&gt; (Local filepath to OP thumbnail) -&gt; “preview_op” ===
=== N/A -&gt; (Local filepath to OP thumbnail) -&gt; “preview_op” ===


The relative path to the image on disk. (Triggers.sql ln.119-139): <code>TODO</code>
The relative path to the image on disk. (Triggers.sql ln.119-139): <code>todo-codeblock</code>


=== N/A -&gt; (Local filepath to reply thumbnail) -&gt; “preview_reply” ===
=== N/A -&gt; (Local filepath to reply thumbnail) -&gt; “preview_reply” ===


The relative path to the image on disk. (Triggers.sql ln.119-139): <code>TODO</code>

=== N/A -&gt; (incrementer) -&gt; “total” ===

The number of posts that refer to this row.

(Triggers.sql ln.123): <code>INSERT INTO &quot;%%BOARD%%_images&quot; (media_hash, media, preview_op, total)</code>

(Triggers.sql ln.127): <code>total = (total + 1)</code>


=== N/A -&gt; (N/A) -&gt; “banned” ===
=== N/A -&gt; (N/A) -&gt; “banned” ===


Not set by Asagi, but respected by it to prevent downloading banned files (Triggers.sql ln.119-139): <code>TODO</code>

===== Table definition =====

(Boards.sql ln.38-49): - ''Table definition''

<pre>CREATE TABLE %%BOARD%%_images (
  media_id SERIAL NOT NULL,
  media_hash character varying(25) NOT NULL,
  media character varying(20),
  preview_op character varying(20),
  preview_reply character varying(20),
  total integer NOT NULL DEFAULT '0',
  banned smallint NOT NULL DEFAULT '0',

  PRIMARY KEY (media_id),
  UNIQUE (media_hash)
);</pre>

===== Image insert procedure =====

(Triggers.sql ln.119-139): - Image insert procedure

<pre>DROP PROCEDURE IF EXISTS &quot;insert_image_%%BOARD%%&quot;;

CREATE PROCEDURE &quot;insert_image_%%BOARD%%&quot;
 (n_media_hash VARCHAR(25),
  n_media VARCHAR(20),
  n_preview VARCHAR(20),
  n_op INT)
BEGIN
  IF n_op = 1 THEN
    INSERT INTO &quot;%%BOARD%%_images&quot; (media_hash, media, preview_op, total)
    VALUES (n_media_hash, n_media, n_preview, 1)
    ON DUPLICATE KEY UPDATE
      media_id = LAST_INSERT_ID(media_id),
      total = (total + 1),
      preview_op = COALESCE(preview_op, VALUES(preview_op)),
      media = COALESCE(media, VALUES(media));
  ELSE
    INSERT INTO &quot;%%BOARD%%_images&quot; (media_hash, media, preview_reply, total)
    VALUES (n_media_hash, n_media, n_preview, 1)
    ON DUPLICATE KEY UPDATE
      media_id = LAST_INSERT_ID(media_id),
      total = (total + 1),
      preview_reply = COALESCE(preview_reply, VALUES(preview_reply)),
      media = COALESCE(media, VALUES(media));
  END IF;
END;</pre>
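
The effect of this procedure can be modelled in memory to make the upsert behaviour easier to follow. The sketch below is an illustration, not part of Asagi: it uses a map keyed by <code>media_hash</code> in place of the unique index, and <code>ImageTableSketch</code> and its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class ImageTableSketch {
    static final class ImageRow {
        final int mediaId;
        String media;
        String previewOp;
        String previewReply;
        int total;
        ImageRow(int mediaId) { this.mediaId = mediaId; }
    }

    // Stands in for UNIQUE (media_hash).
    private final Map<String, ImageRow> rowsByHash = new HashMap<>();
    // Stands in for the auto-incrementing media_id column.
    private int nextMediaId = 1;

    ImageRow insertImage(String mediaHash, String media, String preview, boolean isOp) {
        ImageRow row = rowsByHash.get(mediaHash);
        if (row == null) {
            // First sighting of this hash: plain insert with total = 1.
            row = new ImageRow(nextMediaId++);
            row.media = media;
            row.total = 1;
            rowsByHash.put(mediaHash, row);
        } else {
            // Duplicate key: bump the counter, fill media only if still empty.
            row.total += 1;
            if (row.media == null) row.media = media;
        }
        // COALESCE(existing, new): keep an existing preview, else take the new one.
        if (isOp) {
            if (row.previewOp == null) row.previewOp = preview;
        } else {
            if (row.previewReply == null) row.previewReply = preview;
        }
        return row;
    }
}
```

Inserting the same hash once as an OP image and once as a reply image leaves a single row with <code>total</code> = 2, the first filename in <code>media</code>, and both preview columns filled.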


Line 455: Line 374:
* (AbstractDumper.java ln. 161): Fullsize media downloader thread <code>protected class MediaFetcher implements Runnable {</code>
* (AbstractDumper.java ln. 169): Grab the next item from the queue <code>mediaPost = mediaUpdates.take();</code>
* (AbstractDumper.java ln. 173): Try to handle the media for one post <code>mediaLocalBoard.insertMedia(mediaPost, sourceBoard);</code>
* (Local.java ln. 201): Handler for a post with media <code>public void insertMedia(MediaPost h, Board source, boolean isPreview) throws ContentGetException, ContentStoreException, CfBicClearParseException {</code>
* (Local.java ln. 201): Interact with DB for this media <code>mediaRow = db.getMedia(h);</code>
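
The <code>take()</code> call indicates a blocking-queue worker pattern: the fetcher thread sleeps until a media post is queued, handles it, and loops. A minimal sketch of that pattern using <code>java.util.concurrent.BlockingQueue</code>. The <code>STOP</code> sentinel and the string items are assumptions added so the example terminates; they are not taken from Asagi.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;

class MediaFetcherSketch {
    // Hypothetical sentinel so the example can finish; not part of Asagi.
    static final String STOP = "<stop>";

    static List<String> runWorker(BlockingQueue<String> mediaUpdates) {
        List<String> handled = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String mediaPost = mediaUpdates.take(); // blocks until work arrives
                    if (STOP.equals(mediaPost)) {
                        return;
                    }
                    handled.add(mediaPost); // placeholder for the insertMedia(...) call
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return handled;
    }
}
```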


If there is information for this media in the DB, retrieve it. If any new information exists about this media that is not already in the DB, add that to the DB entry. (SQL.java ln. 289): - Interact with DB for this media <code>public synchronized Media getMedia(MediaPost post) throws ContentGetException, ContentStoreException, DBConnectionException {</code>
If there is information for this media in the DB, retrieve it. If any new information exists about this media that is not already in the DB, add that to the DB entry. (SQL.java ln. 289): - Interact with DB for this media <code>public synchronized Media getMedia(MediaPost post) throws ContentGetException, ContentStoreException, DBConnectionException {</code>


(SQL.java ln. 342-347): - Decide if media row needs an update
(SQL.java ln. 342-347): - Decide if media row needs an update
Line 510: Line 429:
===== YotsubaJSON.java: =====
===== YotsubaJSON.java: =====


(ln.217) <code>private Topic makeThreadFromJson(PostJson pj) throws ContentParseException {</code>
(ln.217) <code>private Topic makeThreadFromJson(PostJson pj) throws ContentParseException {</code>


===== Topic.java =====
===== Topic.java =====
Line 516: Line 435:
(Ln. 22) <code>public Topic(int num, int omPosts, int omImages) {</code>
(Ln. 22) <code>public Topic(int num, int omPosts, int omImages) {</code>


Data flow: 4ch -&gt; Asagi -&gt; DB
Data flow: 4ch -&gt; asagi -&gt; DB


?Seems to be handled by DB triggers.?
?Seems to be handled by DB triggers.?