(Why) would it be "bad practice" to separate CPU and storage to separate devices?

PlutoniumAcid@lemmy.world · edit-2 1 year ago

(Why) would it be "bad practice" to separate CPU and storage to separate devices?

Molecular0079@lemmy.world · 1 year ago

I wouldn’t recommend running container volumes over network shares, mainly because network instability between the NAS and the server can cause some really weird issues. Imagine an application having its files ripped out from underneath it while it’s running.

I would suggest containers + volumes together on the server, and stuff that’s just pure data on the NAS. So for example, if you were to run a Jellyfin media server, the docker container and its volumes will be on the server, but the video and audio files will be stored on the NAS and accessed via a network share mount.

PlutoniumAcid@lemmy.world · 1 year ago

I think you are saying what I am also saying, but my post was not clear on this:

The container files live on the server, and I use the volume section in my docker-compose.yml files to map data to the NFS share:

        volumes:
            - '/mnt/nasvolume/docker/picoshare/data:/data'

Would you say this is an okay approach?

Molecular0079@lemmy.world · edit-2 1 year ago

Mmm, not quite. I am not familiar with how picoshare works exactly, but according to the picoshare docker README, it uses the data volume to store its application sqlite database. My original suggestion is that the Docker application and its application data (configs, cache, local databases, and other files critical to the functioning of the application) should be on the same machine, but the images and other files that picoshare shares can be remote.

Basically, my rule is to never assume that anything hosted on another machine will be guaranteed to be available. If you think picoshare can still work properly when its sqlite database gets ripped out without warning, then by all means go for it. However, I don’t think this is the case here. You’ll risk the sqlite database getting corrupted or the application itself erroring out if there’s ever a network outage.
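As a rough sketch of what that would look like for picoshare, here is the same volumes mapping pointed at a local path instead of the NFS mount. The image name and the `/opt/docker/...` host path are assumptions for illustration; only the `/data` container path comes from the snippet above.

```yaml
services:
  picoshare:
    image: mtlynch/picoshare   # assumed image name, check the README
    volumes:
      # Local disk instead of /mnt/nasvolume/..., so the sqlite database
      # never sits on the network share (host path is hypothetical)
      - '/opt/docker/picoshare/data:/data'
```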

For example, with the Jellyfin docker image, I would say that the cache and config volumes have to be local, while media can be on a remote NAS. My reasoning is that Jellyfin is built to handle media files being changed, added, or removed. It is not, however, built to gracefully handle its config files and caches disappearing in the middle of operation.
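A minimal Compose sketch of that split might look like the following. The host paths are hypothetical, and it assumes the NAS share is already mounted on the host under `/mnt/nasvolume`.

```yaml
services:
  jellyfin:
    image: jellyfin/jellyfin
    volumes:
      # Local disk: application state that must not vanish mid-run
      - '/opt/docker/jellyfin/config:/config'
      - '/opt/docker/jellyfin/cache:/cache'
      # NFS mount on the host: media only, mounted read-only
      - '/mnt/nasvolume/media:/media:ro'
    ports:
      - '8096:8096'
    restart: unless-stopped
```

If the NAS drops off the network, the media library becomes unavailable, but the config and cache stay intact on the server.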

Norah - She/They@lemmy.blahaj.zone · 1 year ago

There are also situations where you can do it safely if there’s already the ability for remote communication built in. I have some MariaDB containers on a different machine to what’s using and serving the data. I could’ve had them in the same Compose file on the one machine, communicating over an internal Docker network. Instead, I just changed the application to point at an external port.
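Roughly, that setup could be sketched as two separate Compose files, one per machine. The application image, the `DB_HOST`/`DB_PORT` variable names, the IP address, and the host path are all hypothetical placeholders; a real application will have its own way of configuring the database address.

```yaml
# File 1, on the database machine: MariaDB with its data on local disk,
# published on the host so other machines can reach it.
services:
  mariadb:
    image: mariadb:11
    environment:
      MARIADB_ROOT_PASSWORD: change-me   # placeholder
    volumes:
      - '/opt/docker/mariadb/data:/var/lib/mysql'
    ports:
      - '3306:3306'
```

```yaml
# File 2, on the application machine: the app talks to the database over
# the network protocol instead of sharing files with it.
services:
  app:
    image: example/app        # hypothetical application image
    environment:
      DB_HOST: 192.168.1.20   # hypothetical address of the database machine
      DB_PORT: '3306'
```

The point is that the database files stay local to the machine running MariaDB, and only client connections cross the network.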

Molecular0079@lemmy.world · 1 year ago

Agreed! If the application can handle these files (or other resources) disappearing for a while during network issues, then sure, they can be separate. However, if an application depends on a file for its core functionality, I do not think it is a good idea.

a_fancy_kiwi@lemmy.world · 1 year ago

Would it be bad practice?

No, it’s fine. Especially for people who self host. Use what you have available to you as best you can

Why would it be bad practice?

Depends on your use case. A gigabit connection and hard drives are fine for something like a personal media server or simple file storage but if you wanted to edit video or play games from the NAS, you might look into upgrading to SSDs and getting a faster connection to the PC

SoNick@readit.buzz · 1 year ago

@a_fancy_kiwi Exactly! In a business environment where you need to squeeze every possible penny and every second of downtime is money lost, OP is introducing additional potential points of failure.

In a homelab where downtime is just an inconvenience? Go for it! Try it for yourself and see how you like it!

@PlutoniumAcid

carzian@lemmy.ml · 1 year ago

What no one else has touched on is that the protocol used for network drives interferes with databases. Protocols like SMB lock files during read/write so other clients on the network can’t corrupt the file by interacting with it at the same time.

It is bad practice to put the docker files on a NAS because it’s slower, and the protocol used can and will lead to docker issues.

That’s not to say that no files can be remote, jellyfin’s media library obviously supports connecting to network drives, but the docker volume and other config files need to be on the local machine.

Data centers get around this by:

running actual separate databases with load balancing
using cluster storage like ceph and VMs that can be moved across hypervisors
a lot more stuff that’s very complicated

My advice is to buy a new SSD and clone the existing one over. They’re dirt cheap and you’re going to save yourself a lot of headache.

marcos@lemmy.world · 1 year ago

Data centers get around this by:

Using network mapped disks instead of network mapped filesystems.

They use SAN and not NAS. The database and VM architecture do not fundamentally change the behavior of the disks, and there isn’t much more complicated stuff beyond that.

mea_rah@lemmy.world · 1 year ago

In the context of self hosting, it’s probably worth pointing out that SQLite specifically mentions NFS on their How To Corrupt An SQLite Database File page.

SQLite is used in many popular services people run at home, often as the only or the default option, because it does not require an external service to work.

tburkhol@lemmy.world · edit-2 1 year ago

The main issues are latency and bandwidth, especially on a server with limited RAM. If you’re careful to manage just the data over NAS, it’s probably fine, especially if the application’s primary job is to serve data over the same network to clients. That will reduce your effective bandwidth, as data has to go NAS->server->client over the same wires/wifi. If the application does a lot of processing, like a database, you’ll start to compromise that function more noticeably.

On applications with low user count, on a home network, it’s probably fine, but if you’re a hosting company trying to maximize the data served per CPU cycle, then having the CPU wait on the network is lost money. Those orgs will often have a second, super-fast network to connect processing nodes to storage nodes, and that architecture is not too hard to implement at home. Get a second network card for your server, and plug the NAS into it to avoid dual-transmission over the same wires. [ed: forgot you said you have that second network]

The real issue is having application code itself on NAS. Anytime the server has to page apps in or out of memory, you impose millisecond-scale network latency on top of microsecond-scale SSD latency, and you put a 1 Gb/s network cap on top of 3-6 Gb/s SSD bandwidth. If you’re running a lot of containers in small RAM, there can be a lot of memory paging, and introducing millisecond delays every time the CPU switches context will be really noticeable.

Scott@lem.free.as · 1 year ago

Rather than NFS, perhaps iSCSI would be a better fit.

billwashere@lemmy.world · 1 year ago

Well it’s not “bad practice” per se but it ultimately depends on what you are trying to accomplish and what the underlying architecture is capable of supporting.

Not Docker specific, but we run enterprise-level applications with VMware ESXi hosts accessing VMs over an iSCSI network share, and we plan on trying this using NFS datastores over 100GbE with SCM and E1.L SSDs. So this should work fine, but this is super fast, super expensive hardware and is a lot different from homelab-type architecture. Now, I have done similar things at home with two ESXi hosts using a Drobo for NFS datastores so I could vMotion. Was it super high performance? No. Did it work? Yes.

For a small container-based environment, I’d probably keep all the containers and the storage for the containers local, probably on a fast SSD or NVMe drive. Those are usually small. But any large media I’d NFS mount and keep on the NAS. You can get a 1TB NVMe drive and PCIe adapter for less than $100 on Amazon that would be way fast enough for anything in a home lab.

My 2¢… 😀

JesterRaiin@lemmy.world · 1 year ago

With each “gate” between the devices, performance, security and stability suffer additional setbacks and limitations. In your case, you introduced a few new “gates”:

the connection between NAS, router and your PC (Wifi/cable)
router
network stability

Of course it’s theory, since the actual performance depends on details: the amount and type of data, how you access it, and what you’re doing with/to it.

lordnikon@lemmy.world · 1 year ago

Doesn’t Synology NAS support SAN volumes out of the box? Then you just set up iSCSI on the Linux system.

MrMonkey@lemm.ee · 1 year ago

I use NFS with caches (cachefilesd) on a local SSD. Works great.

dbaines 🇦🇺@lemmy.world · 1 year ago

I use this approach myself but am starting to reconsider.

I have an Asus PN51 (NUC-like) minipc as my server / brains, hosting all my dockers etc. All docker containers and their volumes are stored locally on the device. I have a networked QNAP NAS for storage for things like Plex / jellyfin.

It’s mostly ok, but the NAS is noticeably slower to start up than the NUC, which has caused issues after power loss where Plex thinks all the movies are gone so it empties out the library, then when the NAS comes back up it reindexes and throws off all the dates for everything. It also empties out tags (collections), and things like radarr and sonarr will start fetching things they think don’t exist anymore. I’ve stopped those problematic services from starting on boot to hopefully fix those issues. I’ve also added a UPS to ride out minor power outages.

celebdor@lemmy.sdf.org · 1 year ago

Might be worth it to have the systemd service that runs the docker container (if you run it like that) have an ExecStartPre= statement that checks for availability of the NAS.
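If the containers are started with Compose rather than a systemd unit, a similar pre-start check can be approximated with a one-shot helper service. This is a different technique from ExecStartPre=, sketched under assumptions: `.nas-mounted` is a hypothetical marker file that you would create once on the NAS share, and the host paths are placeholders.

```yaml
services:
  nas-check:
    image: busybox
    # Exits non-zero if the marker file is absent, which is what you get
    # when the share is not mounted and the host directory is empty
    command: ['test', '-f', '/media/.nas-mounted']
    volumes:
      - '/mnt/nasvolume/media:/media:ro'

  jellyfin:
    image: jellyfin/jellyfin
    depends_on:
      nas-check:
        # Only start once the check has exited successfully
        condition: service_completed_successfully
    volumes:
      - '/opt/docker/jellyfin/config:/config'
      - '/opt/docker/jellyfin/cache:/cache'
      - '/mnt/nasvolume/media:/media:ro'
```

As far as I know, `depends_on` is only honored when the stack is brought up through `docker compose up`, not when the Docker daemon restarts containers on its own after a reboot, so this fits best with services that are not set to start automatically on boot.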

tburkhol@lemmy.world · 1 year ago

You might be able to solve some of these issues by changing the systemd service descriptions. Change/add an After keyword to make sure the network storage is fully mounted before trying to start the actual service.

https://www.golinuxcloud.com/start-systemd-service-after-nfs-mount/