SeaweedFS

From Bibliotheca Anonoma

Gradeschooler Explanation of SeaweedFS

If you can't explain it to a gradeschooler or lawmaker then your docs are shit and you're probably selling empty promises.

SeaweedFS stores data, such as files or things called 'Objects', on many disks across many computers on a network, so that nothing important has to live in only one place. If it's set up right, then when something breaks the data should still be available as if everything were fine, and the technicians can just toss replacements into place transparently to the user.

This means whatever program the user is running does not have to care as much about the inevitable hardware faults across the network.

The reliability comes from having more than one of anything important available and multiple copies of any 'Volumes' and 'Objects' stored in SeaweedFS (a concept called 'replication'), which can be configured from only a single copy for no fault tolerance to 9x9x9 copies for silly levels of fault tolerance.
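A toy sketch of the replication idea, assuming nothing about SeaweedFS's real code: the `ToyServer` class and both helper functions are made up for illustration. It writes the same bytes to more than one pretend machine and shows that a read still works after one of them breaks.

```python
class ToyServer:
    """Stand-in for one storage machine; not a SeaweedFS API."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blobs = {}

    def put(self, key, data):
        if not self.online:
            raise ConnectionError(f"{self.name} is down")
        self.blobs[key] = data

    def get(self, key):
        if not self.online:
            raise ConnectionError(f"{self.name} is down")
        return self.blobs[key]


def replicated_put(servers, key, data, copies):
    """Write the same bytes to the first `copies` servers and return them."""
    targets = servers[:copies]
    for server in targets:
        server.put(key, data)
    return targets


def replicated_get(replicas, key):
    """Return the data from the first replica that still answers."""
    for server in replicas:
        try:
            return server.get(key)
        except ConnectionError:
            continue  # this copy is unreachable, try the next one
    raise IOError(f"no reachable copy of {key!r}")


servers = [ToyServer(f"server{i}") for i in range(3)]
replicas = replicated_put(servers, "photo", b"cat picture", copies=2)
servers[0].online = False                        # one machine breaks
recovered = replicated_get(replicas, "photo")    # served by the second copy
```

With `copies=1` the same failure would make the read raise an error, which is the "no fault tolerance" end of the scale.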

The underlying storage is still files on disks like we normally use, but SeaweedFS uses big files that are faster to access than lots of small files. This is done by 'VolumeServers', and we call those big files 'Volumes'.

Inside the 'Volumes' are smaller units of data called 'Objects'. Large files are stored by splitting them up into many 'Objects'. If data is changed, a new 'Object' is created instead of modifying the existing 'Object'; the reasoning behind this is a technical discussion in its own right.
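The "big file full of small objects" idea can be pictured with a minimal sketch. `ToyVolume` is a hypothetical class, not the real on-disk format: it keeps one append-only buffer plus an index saying where each object starts and how long it is, and a changed version of some data simply becomes a new object at the end.

```python
import io


class ToyVolume:
    """One big append-only file plus an index of where each object lives."""

    def __init__(self):
        self.data = io.BytesIO()   # stands in for the big file on disk
        self.index = {}            # object id -> (offset, size)
        self.next_id = 1

    def append(self, payload):
        """Store payload as a new object at the end of the file; return its id."""
        offset = self.data.seek(0, io.SEEK_END)
        self.data.write(payload)
        object_id = self.next_id
        self.next_id += 1
        self.index[object_id] = (offset, len(payload))
        return object_id

    def read(self, object_id):
        """Jump straight to the object's offset and read exactly its bytes."""
        offset, size = self.index[object_id]
        self.data.seek(offset)
        return self.data.read(size)


volume = ToyVolume()
first = volume.append(b"hello")
second = volume.append(b"hello, world")   # the "changed" data is a new object
```

The old object is never touched, so a reader holding `first` still gets the original bytes.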

SeaweedFS keeps track of where things are in what it calls a 'Filer', which is the part that understands what a file is and has a database of which 'Objects' each file stored in SeaweedFS is made up of.
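A rough sketch of that bookkeeping, under the assumption that a plain dictionary is a fair stand-in for the Filer's database. `ToyFiler` and the tiny `CHUNK_SIZE` are invented for the example and do not reflect real SeaweedFS defaults.

```python
CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ToyFiler:
    """Maps a file path to the ordered list of object ids that make it up."""

    def __init__(self):
        self.objects = {}   # object id -> bytes (stands in for the volume servers)
        self.files = {}     # path -> [object id, ...]
        self.next_id = 1

    def write_file(self, path, content):
        """Split content into chunks, store each as an object, record the list."""
        ids = []
        for start in range(0, len(content), CHUNK_SIZE):
            object_id = self.next_id
            self.next_id += 1
            self.objects[object_id] = content[start:start + CHUNK_SIZE]
            ids.append(object_id)
        self.files[path] = ids   # existing objects are left alone

    def read_file(self, path):
        """Reassemble the file by reading its objects back in order."""
        return b"".join(self.objects[i] for i in self.files[path])


filer = ToyFiler()
filer.write_file("/docs/note.txt", b"hello seaweed")
chunks = filer.files["/docs/note.txt"]
```

Here the 13-byte file ends up as four objects, and only the Filer knows that those four objects together are `/docs/note.txt`.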

In order to access files stored in SeaweedFS using the traditional methods like normal files on other filesystems, 'weed mount' needs to be running to act as a link into the 'Cluster' of storage SeaweedFS is running on.

The different parts of the 'Cluster' are given orders, numbering information, and locations of other parts by the 'Master', which is comparatively simple because it only cares about a few administrative tasks instead of knowing what files are where.
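The 'numbering information' can be pictured through the file ids the Master hands out. The SeaweedFS README shows an id of the form `3,01637037d6` and describes it as a volume id, then a file key, then an 8-hex-digit cookie. The sketch below assumes that layout; `parse_fid` is a hypothetical helper, not part of any SeaweedFS client.

```python
def parse_fid(fid):
    """Split a file id like '3,01637037d6' into (volume_id, file_key, cookie).

    Assumes the layout described in the SeaweedFS README: a decimal volume id,
    a comma, then a hex file key followed by an 8-hex-digit cookie.
    """
    volume_part, rest = fid.split(",", 1)
    if len(rest) <= 8:
        raise ValueError(f"file id too short: {fid!r}")
    key_hex, cookie_hex = rest[:-8], rest[-8:]   # cookie is the last 8 hex digits
    return int(volume_part), int(key_hex, 16), int(cookie_hex, 16)


volume_id, file_key, cookie = parse_fid("3,01637037d6")
```

The volume id is the only part the Master needs in order to say which servers hold the data; the rest is resolved by the volume server itself.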

SeaweedFS provides many methods to access the data stored within it, both as files and as 'Objects'.

Security is mainly done by (optional) preshared secret passwords and cryptographic certificates; access to weed should simply fail if the required certificate is not present. (Certificates are like a container for cryptographic keys and are a complex matter in their own right; we only care that they are secure and that we have matching certificates given to all the different things in the 'Cluster'.)
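To show why a preshared secret makes access "simply fail" for anyone who lacks it, here is a generic sketch using Python's standard `hmac` module. It is not SeaweedFS's actual token format; the secret value and the request text are made up.

```python
import hashlib
import hmac


def sign(secret, message):
    """Return a hex tag proving the sender knows the shared secret."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret, message, tag):
    """Constant-time check; False means the request should be refused."""
    return hmac.compare_digest(sign(secret, message), tag)


cluster_secret = b"example-shared-secret"   # hypothetical value
request = b"write object 42"
tag = sign(cluster_secret, request)

accepted = verify(cluster_secret, request, tag)    # same secret on both sides
rejected = verify(b"wrong-secret", request, tag)   # a party without the secret
```

Every part of the 'Cluster' that holds the same secret can check the tag, and anything without it cannot produce a tag that passes.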

Encryption of data-at-rest is supported but comes with the normal caveats, i.e. if you screw something up you can make the encrypted data impossible to read. If you do not have your boss telling you that you need to be using it, you are likely much better off not encrypting the data being stored, as then recovery will at least probably be possible if you break something.


Normal overview of SeaweedFS

TODO

See the SeaweedFS GitHub repo and SeaweedFS GitHub wiki for official documentation.

Links and References

SeaweedFS Releases page

SeaweedFS GitHub wiki