
[41.78] [Multiplayer] Zomboid Dedicated Server Does Not Handle SIGTERM


Tripodzomboid


• Version: 41.78

• Singleplayer/Multiplayer: Multiplayer

• Host or dedicated: Dedicated

• Mods: No Mods

• Old or new save: N/A

• Reproduction steps:

Start a Zomboid server on Linux and send SIGTERM (signal 15) to its PID. The expected behavior is a graceful exit (equivalent to "quit"), but instead the server terminates ungracefully and does not save properly.

 

See: https://dsa.cs.tsinghua.edu.cn/oj/static/unix_signal.html

Edited by Tripodzomboid
grace to graceful

I'm not well versed in Linux, so could you explain this a little more?

Just from looking at the manual, I'm not sure what the practical difference would be for the game? The JVM will already create its own core dump if it's exited abnormally, so having it sent to console output doesn't seem necessary. We also don't use any additional shutdown hooks (which seems to be what'd cause SIGTERM in Java) and instead use System.exit().


19 hours ago, EnigmaGrey said:

I'm not well versed in Linux, so could you explain this a little more?

Just from looking at the manual, I'm not sure what the practical difference would be for the game? The JVM will already create its own core dump if it's exited abnormally, so having it sent to console output doesn't seem necessary. We also don't use any additional shutdown hooks (which seems to be what'd cause SIGTERM in Java) and instead use System.exit().

 

SIGTERM is usually understood to be a friendly request for a program to exit, which is why a program is allowed to catch or ignore it. Catching SIGTERM and exiting cleanly would let the Zomboid server be managed correctly by systemd (the init system), which sends SIGTERM to stop daemons and processes. Many Linux server owners want to run their server as a daemon, the way most servers (HTTP, SSH, SFTP, etc.) are managed. It would also prevent data loss if, for example, a sysadmin reboots the system without first saving the Zomboid server: during a reboot, systemd sends SIGTERM to each process to ask it to close (a stop job), and if the process doesn't exit in time, systemd eventually sends SIGKILL (unfriendly).
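
To make "catching SIGTERM" concrete, here is a minimal bash sketch of a process that treats the signal as a request to clean up and exit. It is only an illustration, not how the Zomboid server works; run_worker and the "saving" message are hypothetical stand-ins for a real save-and-quit routine.

```shell
#!/bin/bash
# Minimal sketch: a worker that treats SIGTERM as a polite request to stop.
# run_worker and the "saving" message are hypothetical placeholders.

run_worker() {
    # On SIGTERM, do the cleanup work, then exit with status 0.
    trap 'echo "saving"; exit 0' TERM
    echo "started"
    # Sleep in the background and wait on it: unlike a foreground sleep,
    # "wait" returns as soon as a trapped signal arrives.
    while true; do
        sleep 1 &
        wait $!
    done
}
```

Running `run_worker &` and then `kill -TERM $!` should print "saving" and leave an exit status of 0, which is roughly what an init system hopes to see from a stop job.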

 

Perhaps this is already the functionality, though, and I am doing something wrong? Does System.exit() properly save the world file? I believe this issue caused me a world rollback, as I manage my Zomboid server as a daemon, which the wiki recommends.

 

 

Edited by Tripodzomboid
Clarification, and reason for the bug report, and proper reply.

Okay, I think I get what you mean: the PZ server isn’t designed to be a background service. It instead depends on console input to function and will not operate gracefully without it.

You need to enter the save and then quit commands in the console to ensure it shuts down correctly.

This is something we could look at changing in the future now that we’re aware of it, but I really must caution against running the server this way. I feel the wiki may be giving troublesome advice in this case.


2 hours ago, EnigmaGrey said:

Okay, I think I get what you mean: the PZ server isn’t designed to be a background service. It instead depends on console input to function and will not operate gracefully without it.

You need to enter the save and then quit commands in the console to ensure it shuts down correctly.

This is something we could look at changing in the future now that we’re aware of it, but I really must caution against running the server this way. I feel the wiki may be giving troublesome advice in this case.

Running a server this way makes remote administration significantly easier, as the only other way to manage a remote server and do other tasks on the same machine during the same remote session is to use a terminal multiplexer like GNU Screen or tmux. But even if you disagree with this use case, handling SIGTERM would provide a valuable safeguard in other SIGTERM scenarios, such as an accidental system reboot.

Edited by Tripodzomboid
Spelling and Clarification

tl;dr: No, I agree. I think it'd be best for us to add signal handling in the future and one of the devs we've just hired is heavily involved with Linux. He'll be looking into it. It'll not be something that we can do until Build 42, though. Until then the PZ server needs to be treated the way it's designed: as a console application instead of a service.

 

Long version:

 

It's really not that I disagree or that I don't think we should do it.

 

It's that I didn't know this was a "thing" with Java until this conversation. It's just something at least I, personally, have never encountered before (outside of a bit of C++). It's in our interest to support this however, as it explains a few issues we've encountered over the years when supporting users (that is, some likely tried exactly this -- some that offer paid hosting -- without realizing that not all applications are able to handle this without expressly being designed for it).

So as the program is now, it has to be shut down with the console commands. From looking at the Wiki, I believe this is also what you discovered, hence the addition of zombie.command?

If you can work it out so that you first send save, wait 10-15 seconds, then send quit, you should not risk data loss until we can sort this.
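
The save, wait, quit sequence suggested above could be scripted roughly as follows. This is a sketch under one big assumption: that the server reads its console commands from a named pipe you have set up yourself. The function name save_then_quit and both parameters are hypothetical; only the save and quit commands come from this thread.

```shell
#!/bin/bash
# Sketch of the "save, wait, quit" sequence.
# Assumption: the server's stdin is attached to the named pipe given as $1
# (a hypothetical setup). $2 is the delay in seconds, defaulting to 15.

save_then_quit() {
    local pipe="$1"
    local delay="${2:-15}"

    # Ask the server to write the world to disk.
    echo "save" > "$pipe"
    # Give the save time to finish before shutting down.
    sleep "$delay"
    # Ask the server to shut down.
    echo "quit" > "$pipe"
}
```

Note that writing to a named pipe blocks until something has it open for reading, so this only returns promptly while the server (or another reader) is attached to the pipe.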


  • 8 months later...

I've found a few ways around this, but I made an account to share my favorite, since it took me a fair bit of time to learn how to put this stuff together.

 

I'm running a Zomboid server in a Docker container that I built myself as another learning exercise: https://github.com/brenno263/zomboid_server_container

It's not perfect, and certainly isn't documented enough for public use, so let me know if that's something that anyone wants and I can write a nice guide and some better options.

 

Anyways, the main point of this is to share how I handled SIGTERM. Well, here's my docker entry script:

#!/bin/bash

ARGS="-Xmx8G -Xms4G -- -servername da-hood -cachedir=$DATA_DIR"

if [ -n "$ADMIN_PASSWORD" ]; then
	ARGS="$ARGS -adminpassword $ADMIN_PASSWORD"
fi

echo "Using args: $ARGS"

# Set up our fifo pipe to push commands into the running server.
ZOMBOID_STDIN_PIPE="zomboid_stdin_pipe"
mkfifo "$ZOMBOID_STDIN_PIPE"

# Fix to a bug in start-server.sh that prevents zomboid from correctly preloading a library:
# ERROR: ld.so: object 'libjsig.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
export LD_LIBRARY_PATH="${APP_DIR}/jre64/lib:${LD_LIBRARY_PATH}"

# Start the server, taking stdin from our pipe so that we can talk to it later.
# We use the read/write redirect "<>" since it doesn't block while the pipe is empty.
#	(the server should not be stopped to wait for a piped command. "<" would have this effect.)
# Notice that we don't redirect server outputs, so they still land in the terminal.
bash "$APP_DIR/start-server.sh" $ARGS 0<> "$ZOMBOID_STDIN_PIPE" &
SERVER_PID=$!
echo "THE PID OF THE SERVER IS $SERVER_PID!"

# Set QUIT to 1 when we get a SIGTERM or SIGINT, breaking the following sleep loop.
QUIT=0
trap "QUIT=1" TERM INT

# Loop until we've decided to quit or the server is somehow not running.
while [ "$QUIT" -eq "0" ] && kill -0 "$SERVER_PID" >& /dev/null ; do
	sleep 10;
	echo "sleeping"
	# If there's any input hanging out on stdin, go ahead and push it through the pipe.
	# This helps us maintain interactivity when running this script.
	if read -t 0 ; then
		echo "found stdin, writing to pipe."
		while read -r -t 0.5 line; do
			echo "writing $line"
			echo "$line" > "$ZOMBOID_STDIN_PIPE"
		done
	fi
done

# Once we've broken the sleep loop, send the quit command to the server and wait for it to stop.
echo "quitting"
echo "quit" > "$ZOMBOID_STDIN_PIPE"

echo "waiting"
wait # waits for child processes to complete

rm "$ZOMBOID_STDIN_PIPE"
echo "done"

 

With clever use of a named pipe, I'm able to catch any signals, send a command to the server, and wait for it to close. Of course, this solution is for a Docker container, where you have one entry script that gets SIGTERM'd when the container wants to close. It could be adapted without too much trouble for a SystemD setup, though I am abusing Docker's ephemeral filesystem by spitting out a named pipe without a unique name and expecting that not to matter.
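
The read/write redirect is the part of the script above that is easiest to get wrong, so here it is in isolation. This is a small sketch, assuming Linux behavior for FIFOs opened read-write; fifo_roundtrip and the temporary pipe name are made up for the demonstration.

```shell
#!/bin/bash
# Sketch: open a FIFO read-write so that neither the open nor a later
# writer blocks, then pass one line through it.
# fifo_roundtrip and demo_pipe are hypothetical names.

fifo_roundtrip() {
    local dir pipe line
    dir=$(mktemp -d)
    pipe="$dir/demo_pipe"
    mkfifo "$pipe"

    # "<>" opens the FIFO for both reading and writing. A plain "<" would
    # block here until some other process opened the FIFO for writing.
    exec 3<> "$pipe"

    # fd 3 keeps a reader attached, so this write returns immediately.
    echo "$1" > "$pipe"

    # Read the line back from the same descriptor.
    read -r line <&3
    echo "$line"

    exec 3>&-
    rm -rf "$dir"
}
```

Calling `fifo_roundtrip hello` echoes the same line back. In the entry script, the server plays the role of fd 3: it holds the pipe open for the whole session, so each later `echo ... > "$ZOMBOID_STDIN_PIPE"` returns without waiting.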

 

If you are using systemd, however, there's an even better way to achieve this, as seen in this NixOS module for a Minecraft server: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/games/minecraft-server.nix, where around line 184 they configure a systemd-managed socket wired into the stdin of the process. Then, in the exit script on line 26, they echo the quit command into that socket, stopping the server gracefully. This approach could absolutely be used for Zomboid as well.

 

Edit to add: I forgot that the PZWiki recommends exactly this approach with systemd lmao. Would have been way simpler to just link to their guide. https://pzwiki.net/wiki/Dedicated_server#System.d
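
For the systemd route, the stop step might look something like the helper below. This is not the script from the NixOS module or the wiki, just a sketch of the same idea: write quit into the FIFO feeding the server's stdin, then wait for the process to go away. The name stop_server, the argument order, and the timeout are all assumptions. In a unit file it could be called from ExecStop= with $MAINPID as the first argument.

```shell
#!/bin/bash
# Sketch of a stop helper: send "quit" to the server's stdin FIFO, then
# wait for the server process to exit.
# Assumptions (all hypothetical): $1 is the server PID, $2 is the FIFO
# path, $3 is an optional timeout in seconds (default 60).

stop_server() {
    local pid="$1"
    local pipe="$2"
    local timeout="${3:-60}"
    local waited=0

    echo "quit" > "$pipe"

    # kill -0 sends no signal; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "server did not stop within ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Returning non-zero on timeout leaves it to the caller (or to systemd's own stop timeout) to decide whether to escalate.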

 

 

Anyways, I know this is a lot for anyone who isn't deeply invested in linux-land. Feel free to ask me any questions :)

Edited by brenno263
