NoMachine refuses to re-attach to physical session


Viewing 15 posts - 1 through 15 (of 23 total)
  • #18253
    fermulator
    Participant

    Been having this problem for a while, not always able to re-attach to my physical session.

    Running the free version

    my setup:

    • dual monitor, GDM+gnome3 (+system details below)

    my typical workflows:

    • A)
      • (CLEAN) fresh boot
      • login physically, use system, lock screen
      • NX remote in (xN times), use, lock
      • + a series of physical+remote access
    • B)
      • (UNDESIRED) fresh boot from remote
      • login via NX (unsure if this creates a “physical” session, or “virtual” session …), use system, lock (xN times)..
      • login physically, (it usually resumes my previous session), etc

     

    Problem Statement

    • I _know_ I have an active session going on (and I think it’s a physical one)
    • I know it’s running because I run VirtualBox as a userland session, and my VM is fully accessible
    • When I connect remotely via NX, it gives me a VIRTUAL session, and refuses to re-establish with my physical one! grrr
      •  SEE [0_nx_initial_virutal.png]
    • even if I LOGOUT of that virtual session fully, and retry, there’s a sequence of “no session, then oh here’s the session :1001 physical, go to that”
      • SEE [1_nx_retry_sequence.zip]

     

    Relevant Logs

    * see attached nxserver.log

    some logs I felt are useful, was monitoring the NX server log grepping for virtual/physical

    relevant SNIPPETS

    2018-04-26 09:31:25 450.938 17935 NXSERVER Server capacity: virtual/connections: 0/0
    2018-04-26 09:31:25 451.001 17935 NXSERVER User capacity: virtual/connections: 0/0


    2018-04-26 09:31:25 451.326 17935 NXSERVER User <snip_username> is desktop owner, access to physical desktop allowed.

    2018-04-26 09:31:25 451.602 17935 NXSERVER __checkFilterShadowable [yes][physicalDesktop]
    2018-04-26 09:31:25 451.661 17935 NXSERVER __checkFilterShadowable [yes][physicalDesktop] [ok]
    2018-04-26 09:31:25 451.717 17935 NXSERVER __checkFilterReconnectable [not set][][physicalDesktop][Connected] [ok]

    2018-04-26 09:31:25 452.973 17935 NXSERVER Translated result is: [physical-desktop]

    2018-04-26 09:31:29 457.239 17935 NXSERVER Session type ‘physicalAttachDesktop’ main_status ‘Connected’ attached ‘0’.
    2018-04-26 09:31:29 457.401 17935 NXSERVER Show monitor for physical and virtual session

     

    then here comes the rub! from the images, we KNOW my session is at :1001 … but this log says it’s starting up on :1002 🙁

    2018-04-26 09:31:29 514.405 17935 NXSERVER __node_reply return: ‘NX> 700 Session id: <snip_username>-lnx-1-1002-2B7061C38DB2B62583B32A15024BDFC2\nNX> 705 Session display: 1002\nNX> 701 Proxy cookie: ******\nNX>
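One way to make the display mismatch easier to spot is to pull every "Session display" value out of the server log. This is only a minimal sketch: it assumes the log lives at /usr/NX/var/log/nxserver.log (the default log directory named elsewhere in this thread), and the function name nx_session_displays is made up for illustration.

```shell
# Hypothetical helper: print each distinct display number that nxserver
# reported in "NX> 705 Session display: N" replies.
nx_session_displays() {
    log_file="${1:-/usr/NX/var/log/nxserver.log}"
    grep -o 'Session display: [0-9][0-9]*' "$log_file" | awk '{print $3}' | sort -un
}

# Usage:
#   nx_session_displays /usr/NX/var/log/nxserver.log
```

If the physical desktop is known to be on :1001 and only 1002 shows up here, the server started a new session instead of attaching.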

    —-

    system details

    $ uname -a
    Linux <HOST>-1 4.15.15-200.fc26.x86_64 #1 SMP Mon Apr 2 16:25:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
    $ cat /etc/fedora-release
    Fedora release 26 (Twenty Six)
    $ rpm -qa | egrep -i "nx|nomachine"
    nx-libs-3.5.0.33-4.fc26.x86_64
    nomachine-6.0.66-2.x86_64
    inxi-2.3.56-1.fc26.noarch
    remmina-plugins-nx-1.2.0-0.50.20180321.git.f467f19.fc26.x86_64
    nxproxy-3.5.0.33-4.fc26.x86_64

    #18259
    fermulator
    Participant

    (trying to re-upload what failed first post)

    #18282
    Mth
    Contributor

    Hello

    The problem seems to be that NoMachine cannot detect the running physical session and is starting
    its own virtual session. If this is the nomachine-6.0.66-2.x86_64 package on Fedora 26, the most probable
    cause is Wayland. As stated here:

    https://forums.nomachine.com/topic/support-for-wayland-desktops-available-in-nomachine-6-1-6

    https://www.nomachine.com/AR02P00969

    NoMachine only recently released a version that supports Wayland, so if that’s the case an update
    is advised.

    You can also follow the tips here on how to disable Wayland:

    https://www.nomachine.com/FR10N03221

    If this is not the case, we will need further information to properly debug this problem:

    – Which system/desktop environment processes are running for the logged-in user and for the login window screen.
    – nxserver.log from nxserver startup. We only need logs from the nxserver --daemon process, so no logging required.

    /Mth

    #18370
    fermulator
    Participant

    I was not aware of the Wayland new support w/ NoMachine as of 6.1.x. However, my system is not using Wayland ever, I disabled it at the beginning.

    $ grep Wayland /etc/gdm/custom.conf
    WaylandEnable=false
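Note that a plain `grep Wayland` would also print a commented-out `#WaylandEnable=false` line, which has no effect. A slightly stricter check, sketched here with a made-up function name, only accepts an uncommented setting. The session type query in the comments assumes a systemd-logind system.

```shell
# Hypothetical helper: succeed only if WaylandEnable=false is set on an
# uncommented line of the given GDM config file.
gdm_wayland_disabled() {
    conf_file="${1:-/etc/gdm/custom.conf}"
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$conf_file"
}

# Usage:
#   gdm_wayland_disabled && echo "Wayland disabled in GDM config"
# The running session can be cross-checked with logind, which reports
# Type=x11 or Type=wayland:
#   loginctl show-session "$XDG_SESSION_ID" -p Type
```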

    The system is running gnome-shell (gnome3), and GDM

    $ rpm -qa | grep gnome-shell
    gnome-shell-3.24.3-2.fc26.x86_64
    $ rpm -qa | grep gdm
    gdm-3.24.3-1.fc26.x86_64

    Since this persistent issue, I recently (a few days ago) was physically at the workstation and performed a full system restart + physical login. Today attempted to remotely connect via NoMachine. The client still refuses to connect. This time in a different way … the client spins for minutes “Connecting to <host>…”.

    Logs sent to forum[at]nomachine[dot]com.

    #18371
    fermulator
    Participant

    NOTE: i also tried to restart nxserver

    $ sudo /usr/NX/bin/nxserver --restart
    NX> 162 Disabled service: nxserver.
    NX> 162 Disabled service: nxnode.
    NX> 162 Disabled service: nxd.
    NX> 161 Enabled service: nxserver.
    NX> 161 Enabled service: nxnode.
    NX> 161 Enabled service: nxd.

    logs got further but still not connecting

    #18420
    Mth
    Contributor

    Hello

    This is what I see in logs:

    1. The nxserver --daemon process is started, probably after system restart.
    2. It tries to detect any running Xserver, but the command ‘netstat -ln --protocol=unix’
    does not list any proper sockets.
    3. Our virtual Xserver is started automatically and no more attempts to find Xservers are made.
    4. Two login attempts are made, but no attach.
    5. nxserver --restart is invoked and a new server --daemon starts.
    This time the server detects two running Xservers on displays :0 and :1.

    On display :0 we have ‘gnome-session-binary --autostart /usr/share/gdm/greeter/autostart’ running, so it is the Login Window.
    On display :1 we have ‘gnome-session-binary’, so it is the user’s desktop.

    When queried about sessions, the operating system answers that the session on display 0 is inactive and the one on display 1 is active.

    6. The attach is being made to a proper session, but it seems to hang and is terminated after around 90 seconds.
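Step 2 above can be approximated by hand. The sketch below is not NoMachine's detection code; it only assumes that X servers listen on the conventional /tmp/.X11-unix/X<display> sockets and turns the netstat listing into display names. The function name is made up for illustration.

```shell
# Hypothetical helper: read `netstat -ln --protocol=unix` output on stdin
# and print the X displays whose listening sockets appear in it.
x_displays_from_netstat() {
    grep -o '/tmp/\.X11-unix/X[0-9][0-9]*' | sed 's|.*/X|:|' | sort -u
}

# Usage:
#   netstat -ln --protocol=unix | x_displays_from_netstat
```

An empty result right after boot would be consistent with the server starting before any X socket existed.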

    So there are two problems:

    The first is the virtual session that is created after system startup. The probable culprit is the order in which services start on the system: NoMachine is started before the Xserver has finished starting, and that causes problems. This can be mitigated by disabling automatic session creation in the server.cfg file. Please set the key ‘CreateDisplay’ to 0, and nxserver will wait for the Xserver to run
    or it will ask during player connection if such a session should be started.

    The second is the hang during attach. There is unfortunately no clue in the logs provided, so we will require additional data.
    We would need client side logs from the login attempt when the player is hanging. Please refer to the point about client logs in this article https://www.nomachine.com/DT07M00098 and send them to us as before.

    /Mth

    #18441
    fermulator
    Participant

    Sounds like NoMachine has some bugs to figure out 🙂

     

    Just retried:

    once again, couldn’t connect, based on your theory, ran restart service again

    sudo /usr/NX/bin/nxserver --restart

    then:

    (see attached) it prompts to connect to existing session, kk YES — but then no desktop sessions 🙁

    logs again sent to NoMachine forums

    #18445
    Mth
    Contributor

    Hello.

    Regarding the logs you sent us, there is a new behavior, different from the previous problems: the nxexec process crashes when starting the node for the desktop owner. There are a few things to check:

    1. In the logs there are two instances of nxexec, with pids 21223 and 21316. Is there any sign of nxexec or nxnode process crashes in the system logs?

    2. Is there any indication of a problem in our nxerror.log file? This file is located in the ‘/usr/NX/var/log’ directory by default and is an important part of logging. You can also send us this file for investigation. Also please check that SessionLogLevel in the node.cfg file is also set to 7, as I cannot see any node logs and am not sure if the process is started or crashes at runtime.

    3. Please check the access rights to the ‘/usr/NX/bin/nxexec’ file. They should match:

    Access: (4555/-r-sr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)

    4. Please check if nxnode works properly. You can check it with

    /etc/NX/nxnode --version

    If there are any errors please send us the logs again.
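The expected line in point 3 is in the format printed by `stat`. A small sketch for comparing mode and ownership in one go, assuming GNU coreutils `stat` (for the `-c` format option); the function name is hypothetical.

```shell
# Hypothetical helper: print "<octal mode> <owner> <group>" for a file,
# e.g. "4555 root root" for a setuid-root binary.
file_mode_owner() {
    stat -c '%a %U %G' "$1"
}

# Usage:
#   [ "$(file_mode_owner /usr/NX/bin/nxexec)" = "4555 root root" ] \
#       && echo "nxexec permissions match"
```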

    /Mth

    #18735
    fermulator
    Participant

    Retried, extra logs/info. (logs sent by email)

    You’re right, after attempts to attach to the physical display, the nx pids die out. After logging out of the dumb virtual display, no pids are running:

    BEFORE: (initial state)

    [ ~]$ ps wauxxx | grep nxexec
    root       871  0.0  0.0 145424   140 ?        S<   May21   0:00 /usr/NX/bin/nxexec --node --user USER --priority realtime --mode 0 --pid 21
    [ ~]$ ps wauxxx | grep nxnode
    USER+   894  0.0  0.1 1903060 38824 ?       S<l  May21   2:44 /usr/NX/bin/nxnode.bin

    DURING FIRST ATTEMPT

    $ ps wauxxx | grep nxexec
    root      8072  0.0  0.0 145424  8028 ?        S<   09:12   0:00 /usr/NX/bin/nxexec --node --user USER --priority realtime --mode 0 --pid 25
    root      9295  0.0  0.0 139164  8048 ?        S<   09:12   0:00 /usr/NX/bin/nxexec --node --user USER --priority realtime --mode 0 --pid 16 -H 5

    $ ps wauxxx | grep nxnode
    USER+  8098 22.3  0.7 3300988 193736 ?      S<l  09:12   0:09 /usr/NX/bin/nxnode.bin
    USER+  9347  1.8  0.3 2261336 88748 ?       S<l  09:12   0:00 /usr/NX/bin/nxnode.bin -H 5

    # THEN, after logging out of virtual, nothing 🙁

    (no more nxexec nor nxnode active pids)
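A `ps | grep` pipeline can also match the grep process itself, so `pgrep` is a slightly more reliable way to take these before/after snapshots. A sketch assuming procps-ng `pgrep` (for the `-a` option); the function name is made up.

```shell
# Hypothetical helper: list running nxexec and nxnode processes with their
# full command lines, or say so when there are none.
nx_processes() {
    pgrep -a 'nxexec|nxnode' || echo "no nxexec or nxnode processes running"
}

# Usage:
#   nx_processes
```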

     

    I saw these errors in error.log

    Info: Handler started with pid 6970 on Mon Jun 18 09:11:33 2018.
    Info: Handling connection from 192.168.130.4 port 41330 on Mon Jun 18 09:11:33 2018.
    Info: Connection from 192.168.130.4 port 41330 closed on Mon Jun 18 09:11:38 2018.
    Info: Handler with pid 6970 terminated on Mon Jun 18 09:11:38 2018.
    Info: Handler started with pid 7920 on Mon Jun 18 09:12:36 2018.
    Info: Handling connection from 192.168.130.4 port 41342 on Mon Jun 18 09:12:36 2018.
    /bin/cp: cannot stat ‘/usr/NX/share/config/knotifyrc.esd’: No such file or directory
    Info: Connection from 192.168.130.4 port 41342 closed on Mon Jun 18 09:12:51 2018.
    Info: Handler with pid 7920 terminated on Mon Jun 18 09:12:51 2018.

     

     

    #18736
    fermulator
    Participant

    also,

    $ sudo ls -al /usr/NX/bin/nxexec
    -r-sr-xr-x 1 root root 106944 Nov 27  2017 /usr/NX/bin/nxexec
    $ /etc/NX/nxnode --version
    NoMachine Node - Version 6.0.66

     

    #18766
    Mth
    Contributor

    Hello.

    Unfortunately there are still more questions and no answers in the logs.

    From nxerror.log most notable entries are:

    free(): invalid pointer
    nxnode.bin: malloc.c:4030: _int_malloc: Assertion `(unsigned long) (size) >= (unsigned long) (nb)' failed.

    so it looks like the nxnode process is crashing.

    It seems to be caused by something specific to the operating system. Maybe you can think of some
    specific configuration there that can interfere with starting processes? Maybe something in the pam.d
    configuration or user-specific limits? We are using the pam.d/nx configuration file, which is based on su.
    Maybe there is something in pam.d/su that can cause problems, so if it is modified, please send us
    what changes were made and which modules were added. Also there could be a hint in the system log file
    /var/log/auth.log – please look for ‘pam_unix(nx:session)’ entries.

    Going back to the nxnode process crash. There should be a hint for this in the /var/log/syslog file or similar.
    If the operating system is configured to leave core files after a crash, we would be grateful for a backtrace
    from all threads. If not, you can try configuring nxnode to run with valgrind to help us find the problem.
    Please refer to our article at https://www.nomachine.com/AR09L00809 on how to set up NX to work
    with Valgrind.
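On a Fedora system like the one in this thread there is typically no /var/log/syslog; the systemd journal plays that role. A rough sketch for filtering crash-related lines follows. The keyword list is only a guess at what a crash might leave behind, and the function name is hypothetical.

```shell
# Hypothetical helper: read log lines on stdin and keep those that mention
# an NX binary together with a typical crash keyword.
nx_crash_lines() {
    grep -E 'nxnode|nxexec' |
        grep -Ei 'segfault|core dump|sig(segv|abrt)|abort|invalid pointer|assertion'
}

# Usage (journalctl -b shows the current boot):
#   sudo journalctl -b --no-pager | nx_crash_lines
# If systemd-coredump is in use, stored dumps can be listed with:
#   coredumpctl list nxnode.bin
```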

    /Mth

    #18799
    fermulator
    Participant

    nx probably shouldn’t crash if it can’t utilize ‘su’ ;o

    --> Of note, this system uses sssd for authentication (the physical display we’re trying to connect to is initiated by a REMOTE user authenticated against MS Active Directory), AND indeed, these remote users are ONLY allowed to run “sudo” and never “sudo su” or “su” for security audit reasons.

    $ sudo ls > /dev/null
    $ sudo su
    This account is currently not available.

    $ sudo su -
    This account is currently not available.
    $ su -

    Password:
    su: Authentication failure

    (as you can see, user accounts are NOT allowed to become root for any reason)

    What exactly is nx trying to switch user to/for?

    —-

    here’s some dump of information …

    —-

    pam.d/nx

    $ cat /etc/pam.d/nx
    # This is a default PAM configuration for NoMachine. It is based on
    # system's 'su' configuration and can be adjusted freely according
    # to administrative needs on the system.

    auth       include       su
    account    include       su
    password   include       su
    session    include       su

    $ cat /etc/pam.d/su
    #%PAM-1.0
    auth        sufficient    pam_rootok.so
    # Uncomment the following line to implicitly trust users in the "wheel" group.
    #auth        sufficient    pam_wheel.so trust use_uid
    # Uncomment the following line to require a user to be in the "wheel" group.
    #auth        required    pam_wheel.so use_uid
    auth        substack    system-auth
    auth        include        postlogin
    account        sufficient    pam_succeed_if.so uid = 0 use_uid quiet
    account        include        system-auth
    password    include        system-auth
    session        include        system-auth
    session        include        postlogin
    session        optional    pam_xauth.so

    There appear to be no nx:session entries in the audit logs. (This is a Red Hat family system fwiw, Fedora 27, so “auth.log” isn’t a thing; that only exists on Debian and Ubuntu systems.)

    $ for audit_log in $(sudo ls /var/log/audit); do sudo zgrep -ei /var/log/audit/$audit_log | egrep "nx\:session"; done
    $ for audit_log in $(sudo ls /var/log/audit); do sudo zgrep -ei /var/log/audit/$audit_log | egrep "nx:session"; done
    natta
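On Fedora and other Red Hat family systems, PAM session messages usually land in /var/log/secure and the systemd journal rather than in the audit log, so that may be a better place to look. A sketch under that assumption; the function name is made up and it searches whatever files are passed in.

```shell
# Hypothetical helper: print lines mentioning pam_unix(nx:session) from the
# given log files. -F treats the pattern as a fixed string, so the
# parentheses need no escaping; -h drops the file name prefix.
nx_pam_session_lines() {
    grep -hF 'pam_unix(nx:session)' "$@"
}

# Usage (shell functions are not visible to sudo, so spell the grep out):
#   sudo sh -c 'grep -hF "pam_unix(nx:session)" /var/log/secure*'
#   sudo journalctl --no-pager | grep -F 'pam_unix(nx:session)'
```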

     

     

    #18886
    fermulator
    Participant

    (’tis been about 2 weeks since my last reply, any next steps or is there sufficient information now for NoMachine to work on a fix?)

    #18902
    Mth
    Contributor

    Hello

    Sorry for the late answer…
    Yes indeed, it should be working just with that; there has to be something more.

    So first things first, there are two different problems here:

    1. NX cannot make the physical session available.
    2. NX fails to run a desktop on demand (virtual session).

    We need to handle both of those problems separately.

    Problem one I’ve only seen once in the logs, then every other attempt was only NX trying to
    start desktop on demand, so to reproduce this, please do the following:

    1. Please make sure the key ‘CreateDisplay’ in server.cfg file is set to 0 and uncommented.
    2. Please make sure the key ‘SessionLogLevel’ in both server.cfg and node.cfg is set to 7 and uncommented
    3. Do

    # /etc/NX/nxserver --restart

    4. When connecting with player and asked if it should create new desktop, please select “No”.
    5. If the session list is empty, please wait a moment in case it takes longer to run.
    6. Please send us the logs in /usr/NX/var/log/ directory. Preferably all of them.
    7. Please check the system logs in case the nxnode process is crashing; we would like to
    receive what is logged, or a backtrace if present.

    I hope this will give us some hint on both problems, and we can tackle the starting of the virtual session next.

    /Mth

    #18945
    fermulator
    Participant

    Well this is awkward …

    $ egrep -i "CreateDisplay|SessionLogLevel" /usr/NX/etc/server.cfg
    SessionLogLevel 7
    CreateDisplay 0
    # When 'CreateDisplay' is enabled, specify the display owner and let
    # When 'CreateDisplay' is enabled, specify the resolution of the new
    CreateDisplay 1
    #WebSessionLogLevel 6

    How are there TWO entries in there …

    I see first what looks to be the default section

    #
    # Enable or disable the automatic creation of an X11 display when no
    # X servers are running on this host (e.g. headless machine) to let
    # users connect to the desktop. This setting applies to NoMachine
    # servers not supporting virtual desktops and permits to have one
    # single display.
    #
    # 1: Enabled. NoMachine will create automatically the new display at
    #    server startup. This setting has to be used in conjunction with
    #    'DisplayOwner' and 'DisplayGeometry'.
    #
    # 0: Disabled. NoMachine will prompt the user for creating the new
    #    display. This is the default.
    #
    CreateDisplay 0

    and indeed, it’s set to zero

     

    then immediately following …

    #
    # When 'CreateDisplay' is enabled, specify the display owner and let
    # NoMachine create the new display without querying the user. If the
    # server supports only one concurrent connection, the connecting user
    # must be the display owner set in this key.
    #
    #DisplayOwner “”
    DisplayOwner mcallaghan

    #
    # When 'CreateDisplay' is enabled, specify the resolution of the new
    CreateDisplay 1
    # desktop in the WxH format. Default is 800x600.
    #
    #DisplayGeometry 800x600
    DisplayGeometry 800x600

    —-

    I do not recall having manually changed any of this, though it’s possible in some previous NoMachine debug session someone from support had me try that …. I will remove that and retest next week.
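A quick way to spot this kind of duplicate is to print every uncommented occurrence of a key with its line number. This is only a sketch: the function name is made up, and which occurrence NoMachine honors when a key appears twice is not stated in this thread, so the safest fix is probably to leave exactly one.

```shell
# Hypothetical helper: print "line:KEY VALUE" for every uncommented
# occurrence of a key in a "Key Value" style config file.
cfg_active_entries() {
    key="$1"
    cfg_file="$2"
    grep -nE "^[[:space:]]*${key}([[:space:]]|\$)" "$cfg_file"
}

# Usage:
#   cfg_active_entries CreateDisplay /usr/NX/etc/server.cfg
# More than one output line means the key is set more than once.
```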

This topic was marked as solved, you can't post.