Bug 550417

Summary: Yast MUST NOT kill SQUID in middle of system update!
Product: [openSUSE] openSUSE 11.2 Reporter: Karl Eichwalder <ke>
Component: Release Notes Assignee: Karl Eichwalder <ke>
Status: RESOLVED FIXED QA Contact: Stephan Kulow <coolo>
Severity: Major    
Priority: P5 - None CC: aj, coolo, fs, hvogel, ke, ma
Version: Factory   
Target Milestone: RC 2   
Hardware: x86-64   
OS: openSUSE 11.1   
Whiteboard:
Found By: Documentation Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Karl Eichwalder 2009-10-27 14:28:54 UTC
+++ This bug was initially created as a clone of Bug #540587 +++

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3

I was doing a system update.  Normally my web requests go through squid to take advantage of a large local (6GB) disk cache for various downloaded objects.

However, yast also pays attention to the global vars HTTP_PROXY and friends, and uses them during its operations.  Normally not a problem.

However I was doing a Factory update.  That just happened to include squid this time.  About half way through a 500MB download/update it suddenly lost its connection -- and hung.  Restarting squid didn't help.  It was dead and required restart -- at which point trying to redo the factory update was futile -- it thought everything was done -- BUT the configuration of every service it updated was *wiped* and set to 'off'.  

I'm having to go through each service as I find which ones it killed and wiped and reconfig or at least restart them.  But in a few cases, the config files have been reinitialized to non-workable values.  Examples:

xinetd was set up with no tcpmux builtin configured, so installed services like vnc generated errors because the standard vnc-server needs TCPMUX.

The arpwatchd was no longer looking on my inside net, but on my outside
and talking about bogons because it lost its local group list.  It also was no longer in its own group.  named is now generating messages about not having write access to its working directory (don't know if this is a new message, or something was changed).

The squid startup script was replaced and the old one wasn't saved -- I had it set to filter some common error messages (restored from an edit backup).  Might be a few others -- several services were just 'stopped' (chkconfiged 'off') when they had been on.

Have yet to do an exhaustive search for problems, as I'm still in firefighting mode.  But Yast really needs to be smart enough to check: if it is using a proxy server, AND the proxy server is on the same system, it needs to be very circumspect about restarting it (if it should restart it at all; maybe it should ask the user or wait for reboot?)...but if it closed all connections and was sure it would restart, then it could auto-restart it just like when log-rotation happens periodically.
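
The check asked for above (is the proxy from the environment running on this same machine?) could look roughly like the following. This is only a sketch, not what YaST or zypper actually do: the function names, the list of proxy variables, and the rule "loopback address or this machine's own hostname counts as local" are all assumptions made for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumed set of proxy variables ("HTTP_PROXY and friends").
PROXY_VARS = ("http_proxy", "HTTP_PROXY", "https_proxy", "HTTPS_PROXY",
              "ftp_proxy", "FTP_PROXY")


def proxy_hosts(env):
    """Return the set of host names found in the proxy variables of env."""
    hosts = set()
    for name in PROXY_VARS:
        value = env.get(name, "").strip()
        if not value:
            continue
        # Values such as "localhost:3128" have no scheme; add one so that
        # urlsplit can locate the host part.
        if "://" not in value:
            value = "http://" + value
        host = urlsplit(value).hostname
        if host:
            hosts.add(host)
    return hosts


def is_local_host(host, local_names=None):
    """True if host is a loopback address or one of this machine's names."""
    if local_names is None:
        local_names = {"localhost", socket.gethostname().lower()}
    if host.lower() in local_names:
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal and not a known local name.
        return False


def proxy_is_local(env, local_names=None):
    """True if any configured proxy appears to run on this machine."""
    return any(is_local_host(h, local_names) for h in proxy_hosts(env))
```

An updater could call `proxy_is_local(os.environ)` before committing and, if it returns true, hold back the restart of the proxy package or ask the user first. The sketch does not resolve names through DNS, so a proxy reached via some other alias of the local machine would be missed.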

Marking this Critical, as it has cost a large amount of time to recover from this (still in progress)....  and did lose some data (though I probably have
earlier backups I could restore from, I'm taking the opportunity to recheck configs).

I am noticing some (1-2?) incompatible config files with the new versions that were installed.  Will try to remember to file more bugs later...but filing bugs takes me away from fixing, and my email is another service that is broken right now.  

(was trying to fix some configuration problem and I've made things worse! --- 
;^)  so what else is new...but if it hadn't been broken in the first place, I probably wouldn't have been mucking w/it!...but hey, it *is* a learning experience, so I'm not "angry", more like 'annoyed'...and wanting to create this bug as a notice of the problem.)

Reproducible: Didn't try

Steps to Reproduce:
1. have "a lot" of updates (I didn't see all the updates -- it autoselected "a bunch" more than I had picked out, as usual ;^/ )...dependency hell!
2. I was updating through a proxy server (squid) running on the machine being updated -- I didn't explicitly ask yast to do this, it just took the values from my environment vars (HTTP_PROXY & such)...
3. One of the packages updated, "apparently", was squid...right in the middle of the updates.  It left my system in an inconsistent and ill-or-unconfiged state with many services turned off (that had been on).
Actual Results:  
System way messed up.

Expected Results:  
Normal factory upgrade.


==================================================================

aj:

Karl, please add the following to the release notes:

If you update with zypper dup, packages might get restarted during the process
and it can happen that the restart does not succeed before you adjust the
config files.  This is especially critical if your system uses services needed
for downloading the update, e.g. a local proxy (squid) on the machine you
update.

Set commit.downloadMode = DownloadInAdvance in /etc/zypp/zypp.conf so that
first everything is downloaded and then packages get installed.  This needs
enough space on the /var partition to hold all downloaded packages.
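
To verify that the setting above is really active before starting a `zypper dup`, something like this sketch could be used. It assumes zypp.conf is an INI-style file with the key in a `[main]` section; the function name `download_mode` and the sample text are made up for illustration.

```python
import configparser

# Hypothetical sample of the relevant part of /etc/zypp/zypp.conf.
SAMPLE = """\
[main]
## Commented-out default, ignored by the parser.
# commit.downloadMode = DownloadAsNeeded
commit.downloadMode = DownloadInAdvance
"""


def download_mode(conf_text):
    """Return the configured commit.downloadMode, or None if it is not set."""
    # Only "=" separates key and value; interpolation is off so that "%"
    # characters elsewhere in the file are taken literally.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep the mixed-case key names as written
    parser.read_string(conf_text)
    return parser.get("main", "commit.downloadMode", fallback=None)


if __name__ == "__main__":
    print(download_mode(SAMPLE))
```

For the real file, pass `open("/etc/zypp/zypp.conf").read()` instead of `SAMPLE`. A result of `None` means the line is missing or still commented out.
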
Comment 1 Karl Eichwalder 2009-10-27 14:38:09 UTC
Here is my version:

  <!-- bnc#550417 -->
  <sect3 id="zypper-dup">
   <title>System Upgrade with zypper</title>

   <para>If you update with <command>zypper dup</command>, packages might get
restarted during the update process.  It can happen that the restart does not
succeed before you adjust the config files.  This is especially critical if
your system relies on services needed for downloading the update packages,
e.g. a local proxy (squid) on the machine you update.</para>

   <para>
Set <literal>commit.downloadMode = DownloadInAdvance</literal> in
<filename>/etc/zypp/zypp.conf</filename> so that everything is downloaded
first, before the packages get installed.  The download transaction needs a
huge amount of space on the <filename>/var</filename> partition to store all
the software packages.
</para>

  </sect3>
Comment 2 Karl Eichwalder 2009-10-28 08:13:09 UTC
created request id 23312