Saturday, February 11, 2012

Ubuntu Login problems : Getting ch$wned

After leaving my computer on for the night, I awoke to a frozen screen.  My laptop is powered by Ubuntu 10.10 and has some problems with hibernate at times, so I did a hard power-down and rebooted.  Only when it came back up, I got a blank screen.  Oh crap, did my video card die?  Disk, are you corrupt?

It may be that something got screwed up last night via my hadoop installation, but I'm not sure what.  I was also suspicious that it could be a disk hardware issue.  The following is a log of what I had to do to bring it back up.

1) First, when you install Ubuntu, you get a recovery mode that lets you bypass all the gnome stuff.  I used this to get to the recovery menu.

I picked "root login with network only" to get a bash shell.
Next I typed in
$ ifconfig eth0 up
$ startx

It logged me into the Ubuntu desktop under the root account.  I ran
$ touch somefile.txt
and also df, mount, and dmesg, to check whether hardware errors had caused the drive to be mounted read-only.  I found none.  At this point I could have run fsck, but it didn't look like simple corruption, so I decided to skip the consistency check and keep it in my back pocket.
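A minimal sketch of the read-only check described above, assuming a Linux-style mounts table such as /proc/mounts. The helper name mount_is_readonly is made up for illustration and is not a standard tool.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the given mount point is mounted read-only.
# mount_is_readonly is a hypothetical helper name, not a standard tool.
mount_is_readonly() {
    # $1 = mount point, $2 = mounts table to read (defaults to /proc/mounts)
    awk -v mp="$1" '
        $2 == mp { opts = $4 }   # remember the options of the last matching entry
        END { exit (opts ~ /(^|,)ro(,|$)/) ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}

# Example use from the recovery shell:
#   mount_is_readonly / && echo "/ is mounted read-only"
#   dmesg | grep -i 'i/o error'     # look for disk errors reported by the kernel
```

The function only matches a literal "ro" option, so an entry such as errors=remount-ro on a healthy read-write mount is not reported.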


2) Now I tried to log in from my bash shell to see whether the problem was gdmsetup (the login window) or the login itself
$ login jag
No directory, logging in with HOME=/ 
Cannot execute /bin/dash: Permission denied

Ouch.  It seems it can't find my home directory, which is set in /etc/passwd.
I opened that file and, for three different accounts, tried changing the login shell to /bin/sh, /bin/dash, and /bin/bash.  All accounts failed.
Now the clues I have: root login is happy and no user account can log in.  Likely permissions.  So I checked the /home/user folders and ran chown -R jag:jag /home/jag
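As a rough sketch of the passwd check above: the home directory and login shell are the sixth and seventh colon-separated fields of the user's line in /etc/passwd. The helper name user_home_and_shell is hypothetical.

```shell
#!/bin/sh
# Sketch: print the home directory and login shell recorded for a user.
# user_home_and_shell is a hypothetical helper name used for illustration.
user_home_and_shell() {
    # $1 = user name, $2 = passwd-format file (defaults to /etc/passwd)
    awk -F: -v u="$1" '
        $1 == u { print $6, $7; found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/passwd}"
}

# Example use from the recovery shell:
#   user_home_and_shell jag
#   ls -ld /home/jag        # owner, group and mode of the home directory itself
```

The function exits non-zero when the user has no entry, which separates "account missing" from "account present but login failing".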

3) Desperate, and 2 hours gone in mucking around, I searched and found this:
http://linuxgazette.net/issue52/okopnik.html

Though 10 years old, the flow chart of the Linux login is still relevant.  What it told me is that the LOGIN DID work, and that really the only thing wrong was permission to run the shell, /bin/sh.

So I checked the permissions on /bin/sh and its libraries
$ ls -alF /bin/sh
lrwxrwxrwx 1 root root 4 2011-11-03 18:36 /bin/sh -> dash*
$ ls -alF /bin/dash
-rwxr-xr-x 1 root root 105704 2010-06-24 13:02 /bin/dash*

$ ldd /bin/dash
    linux-vdso.so.1 =>  (0x00007fff2c793000)
    libc.so.6 => /lib/libc.so.6 (0x00007fbe75206000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fbe755ae000)

Everything was 755 or better.  Nothing wrong here, it seems.  I went through the permissions of all the lib folders: /lib, /usr/lib, /usr/local/lib.  All seemed correct and undisturbed.

4) Now I started wondering: no permission, login working... what about the logs
in /var/log?

syslog.log
Feb 11 14:38:38 jsrawan-xps16 x-session-manager[2347]: WARNING: Could not connect to ConsoleKit: Could not get owner of name 'org.freedesktop.ConsoleKit': no such name
Feb 11 14:38:38 jsrawan-xps16 x-session-manager[2347]: WARNING: Could not connect to ConsoleKit: Could not get owner of name 'org.freedesktop.ConsoleKit': no such name
Feb 11 14:38:42 jsrawan-xps16 NetworkManager[1173]: <warn> error requesting auth for org.freedesktop.NetworkManager.use-user-connections: (5) Remote Exception invoking org.freedesktop.PolicyKit1.Authority.CheckAuthorization() on /org/freedesktop/PolicyKit1/Authority at name org.freedesktop.PolicyKit1: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.PolicyKit1 was not provided by any .service files
Feb 11 14:38:42 jsrawan-xps16 NetworkManager[1173]: <warn> error requesting auth for org.freedesktop.NetworkManager.network-control: (5) Remote Exception invoking org.freedesktop.PolicyKit1.Authority.CheckAuthorization() on /org/freedesktop/PolicyKit1/Authority at name org.freedesktop.PolicyKit1: org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.PolicyKit1 was not provided by any .service files
auth.log
Feb 11 16:08:39 jsrawan-xps16 login[3660]: pam_unix(login:session): session opened for user testuser by jsrawan(uid=0)
Feb 11 16:08:39 jsrawan-xps16 login[3660]: pam_unix(login:session): session closed for user testuser
Okay, auth.log doesn't give any better errors.  The syslog doesn't look good, but I have no clue if that is the problem.  One thing I realize is that the syslog errors occur at bootup, not PER login.  So I rule that out for now, assuming it's a red herring.

5) 3 hours and many missteps and reboots later, I remain confused, but convinced that the only way an executable can't run is corruption or permissions.  The former is not likely, because the same binaries run fine under my root account.

So I ran strace on "login jag".  I found that the failure is indeed just the exec of /bin/bash

$ strace -s 10000 -vfo login.jag login jag
4135  execve("/bin/bash", ["-bash"], ["TERM=xterm", "LANG=en_US.UTF-8", "HOME=/", "SHELL=/bin/bash", "USER=jsrawan", "LOGNAME=jag", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games", "APPLICATION_ENV=development", "MAIL=/var/mail/jag", "HUSHLOGIN=FALSE"]) = -1 EACCES (Permission denied)
Aha.  I realize that /bin/bash is being started with HOME=/, which is probably fine, and then it will cd to your ~.  But / and /home are usually root-owned, so what do they provide for 'other' permissions?  What should the permissions be?
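With -f and -v the strace output file gets long, so a filter helps. The following is a small sketch that pulls out only the calls that failed with a permission error; the helper name denied_calls is hypothetical, and the pattern assumes strace's usual "= -1 ERRNO (message)" result format.

```shell
#!/bin/sh
# Sketch: list the syscalls in an strace log that failed with a permission error.
# denied_calls is a hypothetical helper name.
denied_calls() {
    # $1 = file written by strace -o (login.jag in the command above)
    grep -E '= -1 (EACCES|EPERM)' "$1"
}

# Example use:
#   denied_calls login.jag
```

The path named in the failing call is the one whose file or parent directories need a closer look.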

Okay, to cut a long story short, I ran
$ ls -alF /home /
and found out that the DIRECTORY itself was 600 (rw- --- ---).  With no execute (search) bit for other users, nobody but root can traverse the directory to reach anything below it, so the bash login could not launch, which is a problem.
How bad was it?  I read that you can get a manifest file of what the permissions should be.  Luckily for me, I could tell that the permissions of files throughout the system seemed good (i.e. I hadn't run a chown -R or chmod -R).  But the root "/" folders were 600.  I logged into another Ubuntu server in the cloud and, lo and behold, ALL ROOT FOLDERS are 755, with the exception of /root and /lost+found, which are 700, and /tmp, which is 1777.
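Checking the binary and its libraries is not enough, because every directory on the way to them must also be searchable. Here is a minimal sketch that prints the mode of a path and each of its parents up to /. The helper name walk_path_modes is hypothetical, and stat -c assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/bin/sh
# Sketch: print mode, owner and name for a path and each of its parent directories.
# walk_path_modes is a hypothetical helper; stat -c assumes GNU coreutils.
walk_path_modes() {
    p=$1
    while :; do
        stat -c '%a %U:%G %n' "$p" || return 1
        # Stop at the filesystem root (or at "." for a relative path).
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}

# Example use:
#   walk_path_modes /bin/bash
#   walk_path_modes /home/jag
```

Where util-linux provides it, namei -l /bin/bash gives a similar listing. Any directory in the output without an execute bit for "other" blocks non-root users. The post does not quote the exact repair command; restoring the affected directory to 755 from the root shell (for example chmod 755 / if / itself is the one showing 600) would presumably be the fix.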

And there you go.  So the moral of the story is: if Linux breaks, 9 times out of 10 it's YOU that broke it.  Review your ~/.bash_history to see if you ran something malicious with chmod or chown.  And this is not Windows; you do not need to re-install the OS every time a registry setting goes bad.
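The history review can be sketched as a simple grep. The helper name risky_history is hypothetical, and the default path assumes bash's usual history file; commands run in a root shell would land in root's own history file instead.

```shell
#!/bin/sh
# Sketch: show history lines that invoke chmod or chown, with line numbers.
# risky_history is a hypothetical helper name.
risky_history() {
    # $1 = history file (defaults to the current user's ~/.bash_history)
    grep -nE '(^|[[:space:];|&])(sudo[[:space:]]+)?(chmod|chown)[[:space:]]' \
        "${1:-$HOME/.bash_history}"
}

# Example use:
#   risky_history
#   risky_history /root/.bash_history
```

Anything in the output that touches / or another system directory, especially with -R, is a good first suspect.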

Some other things I did that didn't help here but could in other cases:
- Ensured that the passwd file is valid with $ pwconv
- Ran $ passwd -u <name> to ensure the account was not locked out
- Created a new user with useradd and passwd to have a fresh example
- Removed the ~/.Xsession in case we had some cached login
- Never change permissions on system files without being 100% sure.
- Ran apt-get upgrade --show-upgraded in case we had some broken packages
- Ran dpkg-reconfigure gdmsetup ubuntu-desktop to check if packages had been deselected
- Ran an ssh login to see if the problem was isolated to gnome or was a general login issue.
