Jacob - Thu, 2015/10/01 - 19:24
I just created a brand new fileserver appliance on the HUB to compare differences between 13 and 14. But Webmin is not starting correctly. I will leave the instance running at testnewtklfs.tklapp.com
ssh is coming up. I can give you a password to login and check what is going on.
Also not working in v13 anymore
Also launched a new FS appliance v13, and surprise: Webmin is not working either.
Must be something in the HUB. My wild guess is that it has something to do with elastic storage. I needed that for the big restore I was testing. Now I have a feeling it is expecting big storage for every new instance. I got this in the console messages.
I don't have enough knowledge of Linux or the workings of the HUB/AWS to solve this :(
The v13 appliance is located at: testfs13.tklapp.com
Not sure why it's not working for you?
It works fine for me: https://testnewtklfs.tklapp.com:12321/
Also your old appliance seems to be working too: https://testfs13.tklapp.com:12321/
My guess is that you forgot to add the "https://" at the start (http connection will fail even if you have the port right). I also checked Webshell while I was at it (just in case that was what you meant) but that works too...
strange
I had been trying for an hour or so, before posting here. I am sure I used https, not http.
Now I can login to Webmin just fine. Yesterday it would not even give me the login screen, just a server timeout.
Will try again today with a fresh new instance, and let you know.
[edit] Just installed a fresh instance from scratch, and it is working perfectly. The only difference is that I am working at a different location now. My next guess is a firewall / NAT issue. To test that hypothesis I will go back to yesterday's location and see what happens... [to be continued]
[edit2] At my office now... v14 is not working. The firewall is not logging anything strange. Can it have anything to do with routing and stunnel mapping 12321 to 10000 and Webmin miniserv listening on either 12321 or 10000? Argh, my head is spinning. Could it be something with NAT / masquerading? The restore I did was from a v13 appliance to a v14 appliance. Maybe the Webmin configs are overwritten by the restore??
This is gonna look like an involuntary crash course in Linux fundamentals III
[edit3] Back home. Firing up my laptop, going to the HUB. The URL to the appliance doesn't work (slow DNS propagation?) but the IP does. https://[ip address]:12321, et voilà, Webmin!
Slow DNS propagation
2 webmin servers
When testing at the office, I have 2 Webmin servers active on the same subnet. Different IPs, but the same port 12321. Could that cause problems?
That would explain why I cannot access the new appliance I am testing, but not the appliance deployed on the HUB.
Digging in my router / firewall documentation. It is a Mikrotik router, firewall is very much iptables.
Looked at my firewall with a microscope, and might have found something. Will have to test to make sure.
[to be continued]
[edit] Seems I forgot to add the [in-interface] in my firewall rule. So every connection to a port 12321 would be validated and forwarded by this rule... Now I can access 2nd webmin server on my subnet and no more problems connecting to webmin servers on the HUB.
[edit2] After restoring v13 backup to my new v14 Fileserver appliance, I cannot access webmin on the new one anymore. Syslog gives me this:
[edit3] /etc/webmin/miniserv.conf files are absolutely different between v13 and v14. Main differences I already spotted are:
Can I just change the file or copy the v14 config file back?
[edit4] adapting /etc/webmin/miniserv.conf solved the problem, and now I can login to webmin again. Next problem: my Samba users do not exist anymore on the v14 appliance...
The users exist as Linux users, so I can convert them to Samba users again. This does not set the Samba password to be the same as the Linux password. Is the libpam-smbpass package removed from the appliance?
v14.0 is different
v14.0 shouldn't need a keyfile as it isn't doing any TLS connections (it only connects locally and stunnel handles the TLS).
In v13.0 Webmin served itself directly on 12321 (instead of default 10000).
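To make that difference concrete, here is a minimal sketch (not taken from either appliance) that reads a few directives from two miniserv.conf-style files. The directive names port, ssl, bind and keyfile are real Webmin miniserv.conf settings, but the sample values and the placeholder certificate path are assumptions for illustration only; check the actual files on your own appliances.

```shell
#!/bin/sh
# Print the value of one directive from a miniserv.conf-style file
# (plain key=value lines). Prints nothing if the directive is absent.
miniserv_get() {
    # $1 = directive name, $2 = path to the config file
    sed -n "s/^$1=//p" "$2" | head -n 1
}

# Hypothetical sample files; real miniserv.conf files contain many more
# lines and the values on a given appliance may differ.
workdir=$(mktemp -d)
printf 'port=12321\nssl=1\nkeyfile=/path/to/cert.pem\n' > "$workdir/v13.conf"
printf 'port=10000\nssl=0\nbind=127.0.0.1\n' > "$workdir/v14.conf"

# Show the two versions side by side for the directives of interest.
for key in port ssl bind keyfile; do
    printf '%-8s v13=%-22s v14=%s\n' "$key" \
        "$(miniserv_get "$key" "$workdir/v13.conf")" \
        "$(miniserv_get "$key" "$workdir/v14.conf")"
done
# Remove "$workdir" when you are done with the samples.
```

On a real appliance you would point miniserv_get at /etc/webmin/miniserv.conf instead of the sample files.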
As for your Samba users not being included, that doesn't sound like how it should be. Is all the data there in the right places? Then again, perhaps the fileserver appliance is meant to auto convert Linux users to Samba users (and sync passwords), and the job that does that just hasn't run yet? You could manually do that from within Webmin but TBH I'd be interested in digging a bit deeper. I'm not sure why it doesn't include them. If you wanted you could do some detective work on your backup to try to make it better/smoother...
Detective work
After I restored the v13 backup to the brand new v14 appliance the webmin config was broken by the /etc/webmin/miniserv.conf being overwritten from the backup. Copying back the default from a fresh v14 appliance solved my problem.
I noticed only 2 Samba users were created: 0-root, and 1000-[username]. No idea why user 1000 was created and others were not. Using a script, generously provided to me by a Linux guru, I checked with rsync for missing data files after the restore. Seems pretty complete ;)
I used the Samba 'convert users' function to re-create the other Samba users. Then I tried to log on to the Samba server with one of those users, no luck. Checking the Samba logs, it showed authorization failure.
Fiddling around a bit with smbclient, I manually updated the Samba password for this user, and could log in to Samba again. At Linux level the passwords were still okay after the restore, but not for Samba. I looked a little bit into password synchronization. It appears a package with the name libpam-smbpass is responsible for the sync. In Webmin I searched for this in the installed packages and got this: Package is no longer installed.
Trying to install the package gives: libpam-smbpass is already the newest version.
Should I try to remove and re-install the package?
I guess I should wait a bit before I replace my live v13 fileserver with v14...
From the commandline:
cli
This is the output:
So Webmin is clearly not reporting correctly. Where to dig now?
Hmm ok so that's installed...
So to recap you have all your user accounts still; just they are only Linux users not Samba users? Is that right? If so I know it's not ideal but perhaps you could try manually clicking the "sync Linux users to Samba users" button in Samba settings in Webmin. TBH I have no idea what the command is to run that from the commandline... What (if anything) happens?
Linux users converted to Samba
Correct. I still have all my Linux users, even old ones that were already disabled. Only 2 users were visible in (converted to) Samba: root (0) and [username] (1000). The last one is also the only user allowed to ssh. Another 15-20 users were not converted to Samba on the new v14 appliance.
I already manually converted a subselection of Linux users to Samba users, using the Webmin environment. Now they exist as Samba users. But the Linux password is not synced as the Samba password. For one user I knew the password and manually updated the Samba password in Webmin. This user can now access the Samba shares and files again.
The problem is two steps:
Any suggestions for next steps? Does this affect the advised approach and/or scripts for upgrading the TKL Fileserver appliance to Jessie?
Webmin does not register packages correctly
Hi Jeremy, still having problems with Webmin. Webmin is reporting packages are no longer installed, while I am sure they are indeed installed. Should I open a new thread?
testing
I am thinking about testing the Linux / Samba users, using 4 scenarios creating new users.
So for user1, I will first create it in Linux and then convert to Samba, using Webmin
I hope this will show if Webmin and its mgt scripts are the issue.
[edit] the results are in:
user1, Webmin, Linux 1st
Create user in other modules:true
user1 is created in Samba
smbclient login with -U user1 works
user2, Webmin, Samba 1st
not a Webmin function
user3, cli, Linux 1st
useradd user3
not yet created in Samba
smbpasswd -a user3
Now created in Samba
smbclient login with -U user3 works
user4, cli, Samba 1st
smbpasswd -a user4
New SMB password:
Retype new SMB password:
Added user user4.
Linux user4 is created, no home directory or shell, inactive, primary group=users
Also created in Samba
smbclient login with -U user4 works
Everything seems to work as it should, even with Webmin reporting that libpam-smbpass is no longer installed.
The errors and unexpected behaviour are somehow related to the process of restoring a v13 backup to a new v14 appliance, but where to look?
Can this be something?
That libpam-smbpass thing kept nagging me. So I decided to try and re-install the package. That gave an error. Apparently there are two password files with the name passdb.tdb on my system now.
Can this be caused by the upgrade to v14 and restore v13 data process? How should I proceed now? (I'm digging deeper into Linux than I've ever done before, so feeling a bit unsure)
[edit] I just verified. My 'old' v13 appliance has the file in /var/lib/samba/passdb.tdb
A fresh v14 appliance uses /var/lib/samba/private/passdb.tdb
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=726472 indicates this may cause the kinds of problems we are running into. But to be honest, I do not understand all the expert talk they are using... Which one of the passdb.tdb files should I use? Delete the other one?? Or do I need to start from fresh, and make some changes to the appliance, before I restore my v13 data??
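For anyone wanting to check whether their own system has ended up with duplicate Samba databases, a minimal sketch along these lines could list them. The function name find_tdb_copies is made up for this example, and the demo runs against a throwaway directory tree rather than the real /var/lib/samba.

```shell
#!/bin/sh
# List every passdb.tdb and secrets.tdb found under a directory tree.
find_tdb_copies() {
    # $1 = directory to search, e.g. /var/lib/samba on the appliance
    find "$1" -type f \( -name passdb.tdb -o -name secrets.tdb \) | sort
}

# Demo on a scratch tree that mimics the situation described above:
# one copy in the old location and one in the new "private" location.
tree=$(mktemp -d)
mkdir -p "$tree/var/lib/samba/private"
: > "$tree/var/lib/samba/passdb.tdb"
: > "$tree/var/lib/samba/private/passdb.tdb"

copies=$(find_tdb_copies "$tree/var/lib/samba")
printf '%s\n' "$copies"
```

More than one passdb.tdb in the output would match the two-file situation described in this post.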
Great work!
I have been battling with the VM builds of the v14.0 appliances. I expected it to be fairly straightforward but by midweek I realised that it was going to be much trickier than I imagined. So I locked myself away to try to resolve that. I had a major breakthrough late last week so very close now. But I digress; back to your issue...
I must say, great detective work! It sounds like you have essentially resolved the issue! Yay! :)
So I suggest that you copy the 'old' secrets.tdb & passdb.tdb from /var/lib/samba/ to the new location of /var/lib/samba/private/
I.e. on your v14.0 appliance try this:
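The copy step described above could be sketched roughly as follows. This is an illustration rather than the exact commands from the original post: the helper name copy_samba_tdbs and the .bak backup suffix are assumptions, and the demo below runs on a scratch directory so nothing on a real system is touched.

```shell
#!/bin/sh
# Copy secrets.tdb and passdb.tdb from the old Samba location to the new
# one, keeping a .bak copy of any file that is about to be overwritten.
copy_samba_tdbs() {
    # $1 = old location, $2 = new location
    for f in secrets.tdb passdb.tdb; do
        [ -f "$1/$f" ] || continue          # skip files that do not exist
        if [ -f "$2/$f" ]; then
            cp -p "$2/$f" "$2/$f.bak"       # keep the current file just in case
        fi
        cp -p "$1/$f" "$2/$f"
    done
}

# On the appliance (as root) the call would be:
#   copy_samba_tdbs /var/lib/samba /var/lib/samba/private

# Demo on a scratch directory with made-up file contents.
old_dir=$(mktemp -d)
new_dir="$old_dir/private"
mkdir -p "$new_dir"
printf 'old-db' > "$old_dir/passdb.tdb"
printf 'fresh-db' > "$new_dir/passdb.tdb"
copy_samba_tdbs "$old_dir" "$new_dir"
ls "$new_dir"
```

Keeping the .bak copies means the fresh v14.0 databases can be put back if the copied ones make things worse.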
Then restart samba and see what happens then... From my reading of the bug report you link to; that should fix it! Fingers crossed...
If that all works then you could just take a new backup of your v14.0 appliance. Although it may pay to keep your v13.0 appliance backup for a little while until you are sure that you no longer need it...
tnx
Does not seem to solve everything. Perhaps I already tested and changed too much.
I will restart with a fresh v14 appliance, do the restore again, and then process the changes needed to get Webmin and Samba working again. (is that something that can be automated for all other fileserver apl. users who still have to upgrade??)
Will help others for sure
For your purposes once you have it sorted and do a backup of your new server then you won't need to worry about it.
Steps needed to get upgraded Fileserver working
These are the steps I performed to test the fileserver upgrade, and the additional steps needed to get Webmin and Samba behaving.
I know I don't need to reboot so often, restarting services should do. But I wanted to be sure I was not missing one or two.
I have also compared the v13 and v14 miniserv.conf files. There are more differences than just the 3 lines I changed. Replacing works for me, but you may want to dig deeper. Sorting both files and comparing with diff:
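A sort-and-diff comparison of that kind could look roughly like this. It is a sketch under assumptions: the helper name sorted_diff is made up, and the two sample files hold only a few illustrative lines, not the real v13 and v14 configs.

```shell
#!/bin/sh
# Compare two key=value config files while ignoring line order:
# sort each file first, then diff the sorted copies.
sorted_diff() {
    # $1 = old file, $2 = new file
    _sd_tmp=$(mktemp -d)
    sort "$1" > "$_sd_tmp/old.sorted"
    sort "$2" > "$_sd_tmp/new.sorted"
    diff "$_sd_tmp/old.sorted" "$_sd_tmp/new.sorted"
    _sd_status=$?                 # 0 = identical, 1 = differences found
    rm -rf "$_sd_tmp"
    return $_sd_status
}

# Hypothetical sample files standing in for the two miniserv.conf versions.
cfgdir=$(mktemp -d)
printf 'port=12321\nssl=1\nroot=/usr/share/webmin\n' > "$cfgdir/v13.conf"
printf 'ssl=0\nroot=/usr/share/webmin\nport=10000\n' > "$cfgdir/v14.conf"

diff_out=$(sorted_diff "$cfgdir/v13.conf" "$cfgdir/v14.conf")
printf '%s\n' "$diff_out"
```

Lines starting with "<" come from the first (v13) file and lines starting with ">" from the second (v14) file; lines that are identical in both, whatever their position, do not show up.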
The new use of stunnel in v14 will break more appliances. I hope this detective work does help a tiny bit in creating an automated fix for all the upgraders out there :D
Thank you so much!
I guarantee it will be helpful for other users so thanks heaps for taking the time to write it all up! :)
Webmin still messed up
I took a leap of faith (with enough backups and a fallback scenario) and converted our production fileserver to v14 and then restored the v13 data back to it. Using all the steps I described I got it up and running during the weekend. On Monday our users didn't even notice any difference. Smooth migration I would say.
But... Webmin is seriously messed up (and of course the usual nagging log messages I have to get rid of one-by-one).
I would say it is installed. But in Webmin => System => Software Packages, Installed Packages, search for package "libpam-smbpass" gives:
Error
Package is no longer installed
This is not just with libpam-smbpass, but also with postfix. Postfix doesn't even show up in the Webmin menu under Servers...
When in Webmin I look at the package tree of installed packages, I tried the first 20 or so packages, they all give the message: package is no longer installed.
I have the feeling Webmin is keeping its own list of packages, and it somehow got out of sync with reality. Is there any way to 'rebuild' it?
Would you do this on a production server?
I found a discussion about Webmin not seeing installed packages here: http://ehc.ac/p/webadmin/discussion/600155/thread/ddaa8ed6/
It recommends to execute this command:
But I don't understand 100% what it does. And I prefer not to test on my production server. What would you do?
TBH I don't use Webmin much
FWIW that command won't work OOTB on TurnKey. You'll need to install aptitude first. Although IMO that's fairly heavy handed and I probably wouldn't do it - certainly not on a production server unless it was a last resort!
FYI it is basically 3 commands rolled into one (the '|', called a pipe, forwards the output of the prior command to the next command). So "dpkg --get-selections \*" basically lists all the packages installed. "awk '{print $1}'" removes everything but the first part of the output (so you just have a long list of packages). Then the "xargs -l1 aptitude reinstall" iterates through the list of installed packages and reinstalls them all...! So essentially it reinstalls everything!
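To see what that pipeline does without changing anything, here is a harmless dry run. It is only a sketch: a small made-up package list stands in for the real "dpkg --get-selections" output, and echo is put in front of aptitude so the commands are printed instead of executed. It also uses "-L 1", the standard spelling of the older "-l1" option.

```shell
#!/bin/sh
# Stand-in for `dpkg --get-selections`: package name, whitespace, state.
# The three packages are just examples.
sample_selections() {
    printf 'postfix\t\t\t\tinstall\n'
    printf 'samba\t\t\t\tinstall\n'
    printf 'webmin\t\t\t\tinstall\n'
}

# Same shape as the real pipeline:
#   1. list the packages
#   2. awk keeps only the first column (the package name)
#   3. xargs runs one command per input line
# `echo` makes xargs print each command instead of running aptitude.
dry_run=$(sample_selections | awk '{print $1}' | xargs -L 1 echo aptitude reinstall)
printf '%s\n' "$dry_run"
```

Each printed line is one command the real pipeline would run, which makes it obvious that every installed package gets reinstalled in turn.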
Something to check re Webmin postfix is to make sure that the webmin-postfix module is installed. IIRC that's exactly what it's called so you can check with "apt-cache policy webmin-postfix" and if it's not installed then you can install it with "apt-get install webmin-postfix".
Regarding the issues with the Samba webmin module you could try removing it and reinstalling it:
Not testing on prod
I will not test on my prod server. So today I decided to create a fresh instance on AWS. My idea was creating a clean instance, and then instead of restoring from TKLBAM, migrating all users and then the data and configs for Samba.
Directly after getting it up, before making any modification or configuration, I checked the installed packages in Webmin. And guess what: Error, Package is no longer installed.
For every package I checked, including Samba, Postfix and Webmin itself. I expected a clean image to run flawlessly, but that is not the case.
Perhaps I should downgrade back to TKL v13?
I think I know what the problem is!
When you search for or install software from the commandline you use the "apt-get update" command to update the local database of available packages prior to searching or installing. It seems that Webmin leverages the commandline tools for its built-in package management. So it can't work out what packages are installed until this update is run.
From the commandline "apt-get update" should fix it; or from within Webmin click the (IMO poorly named) "Upgrade Now" button right at the bottom of the Software Packages page (Webmin => System => Software Packages).
Let me know how it goes...
FYI I didn't actually test the Fileserver appliance; but I did confirm that postfix is visible as installed in Webmin on Core (and Fileserver inherits both Webmin and Postfix from Core so if it works in Core then it should work everywhere).
Also assuming that I'm right; then this should not be an issue if you install security updates when you first launch (it will run apt-get update as part of the process) or if you wait 24 hours (the auto security updates also run apt-get update as part of the process). That may also explain why no one else has mentioned this issue.
nope
Just restarted the stopped instance. First tried the "upgrade now" from webmin. That did not work.
Then logged in as root and did the "apt-get update" from cli. Webmin still shows basic packages as "no longer installed"
Postfix is installed and visible from the Webmin-Servers menu. (But not as installed package).
I will keep the instance running for 24 hrs now. See if that makes any difference.
Hmm, ok. Sounds like I need to dig deeper...
However it's clear that I need to properly test it on the Fileserver appliance; and perhaps even do a migration so as to make sure that I can reproduce the issue first.
Perhaps there is a bug in Webmin related to the switch from Samba3 (TurnKey v13.0/Debian Wheezy) to Samba4 (TurnKey v14.0/Debian Jessie)? A quick google didn't bring up any info though so perhaps not... Perhaps there is something being included in the TKLBAM backup (related to Webmin config) that shouldn't be (something overwriting the newer version of Webmin config included in v14.0)?
clean fileserver v14 instance
Jeremy, my last test is a clean fileserver v14 instance via the Hub. Completely standard T1.micro, but Root filesystem size changed to 100 GB. No configuration or modifications. No extra installs. Only applied security updates. I did not restore anything with TKLBAM, so this should be a lot easier to reproduce for anyone else. It has Postfix, also in the server menu item (that got messed up in earlier tests, because of my restore, old srv had no postfix so it got removed with the restore).
But when I go to system - software packages, top right button 'package tree', expand first line A-E, the first 10 or so packages I click on, all give the 'no longer installed' message.
It may be nothing, but I am not well versed in the cli, and use webmin for more than 50% of my sys.adm. tasks. (Gone are the ms-dos 3.20 days, when I knew a lot of hex codes for INT 21h from the top of my head... It seems I moved from kernel to user space, lol)
One more lead: after a reboot this is in the auth logfile. Notice the webmin auth failure
I can reproduce...
I did a quick google and found this thread. Which makes me think that something has changed for Debian Jessie. Perhaps there is a dependency missing? Or maybe there is an undocumented Webmin bug?
To investigate the missing dependency angle, I double checked the Webmin Debian install docs and it appears that there is indeed a dependency missing: apt-show-versions
I thought that seemed really promising so I installed it and restarted Webmin but it appeared to make no difference... :(
Also I notice that 1.770 is out
Also I've posted a bug with Webmin (see here) to see if upstream have any ideas.
I have tested the 1.770 provided by Webmin (upstream)
So it's either a Webmin bug or a missing (but undocumented) dependency...
We have a work around...
w00t
That works!
So we already solved 3 issues in one forum thread ;)
I hope these solutions all can be automated for future upgraders.
We will workaround 3 for sure
However I will discuss this more with upstream. IMO ideally it should gather this information using apt instead (i.e. in a way that doesn't rely on dselect). AFAIK the problem with using dselect's package info is that there is a risk that the info will get out of sync (with apt) over time. However I don't know enough about how dselect and apt interact to be sure...
As for 1 and 2 - I would love to automate those fixes too and it can be done; with TKLBAM hooks. However I haven't used them before so will need to have a play with TKLBAM to work out how that should be done... Until then; just having it documented is a great start.
Thank you so much for your great work on these issues!
FWIW I have created 2 "issues" related to this