Editing FoolFuuka/Asagi

From Bibliotheca Anonoma


The Asagi Imageboard Archival Standard was developed by eksopl of Easymodo and the Foolz team under the direction of woxxy. It was developed to run the Foolz archiver, and has been the engine for the majority of archivers since the collapse of Archive.moe.


== SQL Schema ==
Two versions of the SQL schema in use can be identified:

* 1.0.0 (2013) - The final version by eksopl for Foolz.us; the SQL schema was unchanged in Archive.moe.
** Should be in use by Fireden, but this is unconfirmed.
** Might still be used by Nyafuu; was in use by Loveisover.
* 1.3.0 (2019) - The final reference standard, consolidated from the 4plebs repos by the Bibliotheca Anonoma for use in Desuarchive, in preparation for the development of new drop-in replacements. Only Mysql/triggers.sql was changed; there are no other structural SQL schema changes.
** Used by Desuarchive, Rbt, arch.b4k.co, maybe 4plebs?

A refined version 2.0.0 is proposed that would eliminate SQL triggers for improved performance, leaving it to the scraper engines to perform the equivalent operations.


== Reference Implementation ==
** Despite the fact that PostgreSQL is supported by Asagi Scraper, it is not supported by FoolFuuka.
* Scraper: Asagi (Java) - https://github.com/bibanon/asagi
* Frontend: FoolFuuka (PHP) - https://github.com/bibanon/FoolFuuka
* Frontend: FoolFuuka (PHP) - https://github.com/pleebe/FoolFuuka/tree/experimental
* PHP Engine: Historically HHVM, PHP 5.x compatible. Desuarchive and 4plebs now use PHP 7.
* Search: Sphinxsearch
Also check [[FoolFuuka/Install/Ubuntu16|FoolFuuka/Install/Ubuntu16]]


= How Asagi does stuff =


== Configuration ==
AbstractDumper.java ln. 95: <code>public void initDumper(BoardSettings boardSettings) {</code>


== How Asagi decides to update a thread: ==


In <code>initDumper()</code>, <code>DumperJSON</code> spawns an instance of its inner class <code>BoardPoller</code> on a thread. In its <code>run()</code> method, <code>BoardPoller</code> loops indefinitely:


* Wake up from sleeping (duration set by <code>refreshDelay</code> in the configuration)
* <code>threadList = sourceBoard.getAllThreads(lastMod);</code>
* If the request 304s or errors, go to sleep
* Go over the previous threads:
** If this thread is in the current threads and it’s been modified, mark its modification timestamp and page number. Then, push it to <code>newTopics</code>.
** If this thread is not in the current threads, it’s been deleted. Push it to <code>newTopics</code>.
* Put the remaining threads in newTopics
* Sleep until the delay expires
 
The queue <code>newTopics</code> is processed by <code>AbstractDumper</code>’s inner class <code>TopicFetcher</code>.
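
The comparison step of the polling loop above can be sketched roughly as follows. This is a minimal illustration, not Asagi's actual <code>BoardPoller</code> code: the class <code>PollDiffSketch</code>, the method <code>threadsToFetch</code>, and the use of plain maps from thread number to last-modified timestamp are all assumptions made for the example. Sleeping, the 304 check, and page-number tracking are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of one polling pass; names are not taken from Asagi.
final class PollDiffSketch {
    /**
     * Both maps go from thread number to last-modified timestamp.
     * Returns the thread numbers that would be pushed to newTopics.
     */
    static List<Integer> threadsToFetch(Map<Integer, Long> previous,
                                        Map<Integer, Long> current) {
        List<Integer> newTopics = new ArrayList<>();
        // Copy so we can tick off threads we have already seen.
        Map<Integer, Long> remaining = new LinkedHashMap<>(current);

        for (Map.Entry<Integer, Long> old : previous.entrySet()) {
            Long nowModified = remaining.remove(old.getKey());
            if (nowModified == null) {
                // Not in the current list any more: treated as deleted.
                newTopics.add(old.getKey());
            } else if (!nowModified.equals(old.getValue())) {
                // Still listed, but its timestamp changed: modified.
                newTopics.add(old.getKey());
            }
            // Unchanged threads are skipped.
        }

        // Whatever is left was not known before: brand-new threads.
        newTopics.addAll(remaining.keySet());
        return newTopics;
    }
}
```

In this sketch an unchanged thread never reaches the queue, so the fetcher only does work for threads that were modified, deleted, or newly created since the last pass.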
 
== When Asagi does a thread update: ==


in : YotsubaJSON.java, ln. 88: <code>public Topic getThread(int threadNum, String lastMod) throws ContentGetException, ContentParseException, CfBicClearParseException {</code>
* Loads the thread JSON.
* Decodes the JSON.
* For each post in the decoded thread JSON:
** If the <code>resto</code> value is zero, create a new thread from that post, setting its last-modified time to the time the JSON was fetched: <code>t = this.makeThreadFromJson(pj);</code>
** If <code>resto</code> is nonzero, add the post to the current thread: <code>t.addPost(this.makePostFromJson(pj));</code>

(What if two posts were resto==0? We’d break!)
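
As a rough illustration of the per-post loop, here is a self-contained sketch. The classes <code>ThreadAssemblySketch</code>, <code>PostJson</code>, and <code>Topic</code> below are simplified stand-ins invented for this example, not Asagi's real classes. Rejecting a second <code>resto==0</code> post with an exception is one possible answer to the question above, chosen here as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the real JSON and thread classes.
final class ThreadAssemblySketch {
    static final class PostJson {
        final int no;     // post number
        final int resto;  // 0 for the opening post, otherwise the thread number
        PostJson(int no, int resto) { this.no = no; this.resto = resto; }
    }

    static final class Topic {
        final int num;
        final List<Integer> replies = new ArrayList<>();
        Topic(int num) { this.num = num; }
    }

    static Topic assemble(List<PostJson> posts) {
        Topic t = null;
        for (PostJson pj : posts) {
            if (pj.resto == 0) {
                // Opening post: starts the thread. A second one is an error here.
                if (t != null) {
                    throw new IllegalStateException("two opening posts in one thread");
                }
                t = new Topic(pj.no);
            } else {
                // Reply: needs an opening post to attach to.
                if (t == null) {
                    throw new IllegalStateException("reply seen before opening post");
                }
                t.replies.add(pj.no);
            }
        }
        if (t == null) {
            throw new IllegalStateException("thread has no opening post");
        }
        return t;
    }
}
```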


== What does Asagi do with a post in a thread? ==


Relevant files: >YotsubaJSON.java >YotsubaAbstract.java
Sanitized? EXIF data YotsubaJSON.java (ln. 212): <code>p.setExif(this.cleanSimple(this.parseMeta(pj.getCom(), pj.getUniqueIps(), pj.getSince4pass(), pj.getTrollCountry())));</code> Return the Post() object.


== How Asagi handles an image in a thread? ==


Files of note: >Local.java - Saving image files
Local.java ln.201: <code>public void insertMedia(MediaPost h, Board source, boolean isPreview) throws ContentGetException, ContentStoreException, CfBicClearParseException {</code>


== How Asagi deals with post deletions? ==


Relevant files: >src.java >src.java >src.java
== Images table values: ==


These seem to be handled by triggers that run on post insert.
 
=== 4ch -> (Asagi) -> DB ===


=== N/A -> (Incremental integer) -> “media_id” ===
The relative path to the image on disk. (Triggers.sql ln.119-139): <code>TODO</code>


=== N/A -> (incrementer) -> “total” ===


The number of posts that refer to this row.
* (AbstractDumper.java ln. 161): Full-size media downloader thread <code>protected class MediaFetcher implements Runnable {</code>
* (AbstractDumper.java ln. 169): Grab a post from the queue <code>mediaPost = mediaUpdates.take();</code>
* (AbstractDumper.java ln. 173): Try to handle the media for one post <code>mediaLocalBoard.insertMedia(mediaPost, sourceBoard);</code>
* (Local.java ln. 201): Handler for a post with media <code>public void insertMedia(MediaPost h, Board source, boolean isPreview) throws ContentGetException, ContentStoreException, CfBicClearParseException {</code>
* (Local.java ln. 201): Interact with DB for this media <code>mediaRow = db.getMedia(h);</code>


If there is information for this media in the DB, retrieve it. If any new information exists about this media that is not already in the DB, add it to the DB entry.

* (SQL.java ln. 289): Interact with DB for this media <code>public synchronized Media getMedia(MediaPost post) throws ContentGetException, ContentStoreException, DBConnectionException {</code>
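
The "retrieve the row, then fill in anything new" behaviour can be sketched with an in-memory map standing in for the images table. Everything here is an assumption made for illustration: the class <code>MediaRowSketch</code>, the field names, keying rows by a media hash string, and the simplified three-argument <code>getMedia</code>. It is not the real <code>SQL.java</code> signature or schema.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the images table; not Asagi's SQL.java.
final class MediaRowSketch {
    static final class Media {
        final int mediaId;   // incremental integer, assigned on first sight
        final String hash;   // assumed lookup key for the row
        String media;        // full-size file name, may be unknown at first
        String preview;      // thumbnail file name, may be unknown at first
        Media(int mediaId, String hash) { this.mediaId = mediaId; this.hash = hash; }
    }

    private final Map<String, Media> rowsByHash = new HashMap<>();
    private int nextMediaId = 1;

    // Look the row up, create it if missing, then fill in fields that are still empty.
    synchronized Media getMedia(String hash, String media, String preview) {
        Media row = rowsByHash.get(hash);
        if (row == null) {
            row = new Media(nextMediaId++, hash);
            rowsByHash.put(hash, row);
        }
        // Only add information the row does not have yet; never overwrite.
        if (row.media == null && media != null) {
            row.media = media;
        }
        if (row.preview == null && preview != null) {
            row.preview = preview;
        }
        return row;
    }
}
```

The method is <code>synchronized</code> here for the same reason the quoted signature is: several fetcher threads may ask about the same media at once, and only one of them should create the row.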


(SQL.java ln. 342-347): - Decide if media row needs an update
(Ln. 22) <code>public Topic(int num, int omPosts, int omImages) {</code>


Data flow: 4ch -> Asagi -> DB


?Seems to be handled by DB triggers.?