• mis poll gets 'stuck'

    From Biite@21:3/120 to All on Thu Jul 22 23:34:29 2021
    Hi all,

    Got the weirdest thing here:
    - Mystic 1.12 A46 running on Ubuntu 20.04.2
    - Experienced a filesystem full on the /mystic filesystem
    - Now sometimes a 'mis poll' gets stuck:
    mbbs@svr1:/mystic/semaphore$ ps fax|grep mis
    182272 ? SLsl 0:57 /mystic/mis daemon
    183902 ? S 0:00 \_/bin/sh -c ./mis poll 21:3/100 1> /dev/null
    /dev/null
    183903 ? RLl 1201:23 | \_ ./mis poll 21:3/100

    When killing the 'mis poll' (e.g. kill 183903) the whole process runs again.

    Any ideas?

    Martien

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Fri Jul 23 10:11:21 2021
    Hello Biite!

    On 22 Jul 2021, Biite said the following...
    - Experienced a filesystem full on the /mystic filesystem
    - Now sometimes a 'mis poll' gets stuck:

    Really strange (shouldn't be affected by the earlier disk full problem, one would think)...

    I had problems with mis poll hanging (and consuming lots of CPU) previously; when I traced the process, it was looping around in gettimeofday(), I think.

    You might want to use the timeout command to safeguard against hangs, e.g.:

    timeout -k 300 --preserve-status -v 300 ./mis poll 21:3/100

    ...which would allow mis to run for 5 minutes, then send a TERM signal, wait up to 5 more minutes for it to finish, and then send a KILL signal.
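
    The behaviour can be tried out safely with a stand-in command first. This is only a sketch: run_limited is a made-up helper name, sleep stands in for a hung ./mis poll, and the 1 and 2 second values are shortened from the 300 seconds above so the demonstration finishes quickly.

```shell
#!/bin/sh
# run_limited LIMIT GRACE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND for at most LIMIT seconds, send TERM when
# the limit is reached, and send KILL if it is still alive GRACE seconds later.
run_limited() {
    limit=$1
    grace=$2
    shift 2
    timeout -k "$grace" --preserve-status "$limit" "$@"
}

# 'sleep 30' stands in for a poll that never finishes on its own.
status=0
run_limited 1 2 sleep 30 || status=$?
echo "exit status: $status"
```

    With --preserve-status the exit status is that of the command itself, so a process ended by TERM should report 143 (128 + 15) rather than timeout's usual 124.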

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/18 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Fri Jul 23 23:34:06 2021
    Hi Zip,

    You might want to use the timeout command to safeguard against hangs,

    Set up the timeout command as you suggested and will keep monitoring the logs.
    Thanks for the suggestions!

    Regards,
    Martien

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Sat Jul 24 08:06:52 2021
    Hello Biite!

    On 23 Jul 2021, Biite said the following...
    Set up the timeout command as you suggested and will keep monitoring the logs.
    Thanks for the suggestions!

    You're very welcome!

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/18 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Sun Jul 25 23:01:54 2021
    On 24 Jul 2021, Zip said the following...

    You're very welcome!

    Looks like your timeout command did the trick, haven't had a 'hang' in a few days!

    Regards,
    Biite

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Sun Jul 25 23:33:00 2021
    Hello Biite!

    On 25 Jul 2021, Biite said the following...
    Looks like your timeout command did the trick, haven't had a 'hang' in a few days!

    Glad to hear that! :)

    Still wondering what could be the cause of the hangs -- the strace output showed that it was waiting for some kind of monotonic timer event if I recall correctly...

    I wonder whether time went backwards or something (but that's usually not the way time adjustments are made, I think, except *maybe* during boot when syncing with the hardware clock, and if so, hopefully *before* all services are started)...

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/23 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Mon Jul 26 17:35:27 2021
    On 25 Jul 2021, Zip said the following...

    Still wondering what could be the cause of the hangs

    Same here, I haven't checked with strace (not really familiar with how to do that ;) )

    Thinking if it could be that time went backwards or something

    My mis poll got stuck some time after a logrotate run at midnight. This logrotate run stops and starts the mis server. The poll got stuck a few hours (about 3) after that.
    Also, it only started getting stuck after my filesystem filled up. I did not upgrade versions or change anything else, and I had been running 1.12 A46 for more than a year without issues before this started.

    Maybe I can check whether I have the same issue if you send some explanation of how to 'strace' mis :)

    Regards,
    Biite

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Wed Sep 15 23:21:08 2021
    Hello Biite!

    I only got this now -- or my Mystic's message pointers are acting up. =)

    Anyway...

    On 26 Jul 2021, Biite said the following...
    Same here, haven't checked here with strace (not really familiar on how to do that ;) )

    To run mis poll manually from the command line and trace what it's doing, you could use something like:

    strace -f -s32768 -vvv ./mis poll 2>&1 | tee /tmp/trace.txt

    ...and then break with Ctrl+C and look at /tmp/trace.txt.

    But it's not a good "solution" for intermittent errors.
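
    For an intermittent hang it may be easier to attach to the poll that is already stuck instead of starting a new one under strace. A rough sketch, assuming strace and pgrep are installed; find_pids and trace_pid are made-up helper names, and the 10 second default and the /tmp output path are arbitrary choices. Attaching usually needs root or the same user as the process, depending on the system's ptrace settings.

```shell
#!/bin/sh
# Hypothetical helper: print the PIDs of processes whose full command line
# matches a pattern (pgrep -f matches the command line, not just the name).
find_pids() {
    pgrep -f "$1"
}

# Hypothetical helper: attach strace to one running PID for a limited time
# and write the trace to a file. timeout ends strace, which then detaches
# and leaves the traced process running.
trace_pid() {
    pid=$1
    seconds=${2:-10}
    out=${3:-/tmp/trace.$pid.txt}
    timeout "$seconds" strace -f -tt -p "$pid" -o "$out"
    echo "trace written to $out"
}
```

    For the stuck process in the ps listing earlier in the thread, that would be something like trace_pid 183903, followed by a look at /tmp/trace.183903.txt to see which call keeps repeating.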

    As for logrotate and stopping/starting Mystic, I use the "copytruncate" logrotate option instead, so that the "rotated" files are copies of the current logs and the current logs are truncated (and Mystic continues to write to them, so no need for stopping/starting Mystic).

    Something like:

    /mystic/logs/*.log {
        daily
        notifempty
        missingok
        rotate 7
        compress
        delaycompress
        copytruncate
        su bbsuser bbsgroup
    }
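
    To see why copytruncate lets the writer carry on without a restart, the idea can be imitated by hand. This is only an illustration of the principle, not what logrotate runs internally: copytruncate_demo is a made-up name, and file descriptor 3 opened in append mode plays the part of a daemon that keeps its log file open.

```shell
#!/bin/sh
# Hypothetical demo: copy a log, truncate the original in place, and show
# that a writer holding the file open in append mode keeps working.
copytruncate_demo() {
    dir=$(mktemp -d) || return 1
    log=$dir/demo.log

    exec 3>>"$log"                # the "daemon" opens its log once, in append mode
    echo "before rotation" >&3

    cp "$log" "$log.1"            # the rotated file is a copy of the current log
    : > "$log"                    # the original is truncated, not replaced

    echo "after rotation" >&3     # same descriptor, still writing to the same file
    exec 3>&-

    echo "rotated: $(cat "$log.1")"
    echo "current: $(cat "$log")"
    rm -rf "$dir"
}

copytruncate_demo
```

    The logrotate documentation notes that there is a very small window between the copy and the truncate in which log lines can be lost, which is the price for not having to restart the writer.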

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/09/07 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)