Takes longer to solve in new version primary due to items now not finding a solution in presolve #175

xnaegamvala · 2020-11-09T20:34:02Z

Forwarded from our development team:
While upgrading from Symphony version 5.6.14 to version 5.6.17 we are seeing some strange behavior on both our linux and windows builds. The examples below were generated using the windows build but we are seeing similar behavior on linux as well. We have a small test case with 2,278 rows and also 2,278 constraints (LPT file attached in scenario.zip).

On version 5.6.14, this case solves almost immediately. On version 5.6.17, the case takes 2 to 2.5 minutes to solve. The output logs with verbosity level = 50 are attached. The default settings are being used (do_primal_heuristic=1, prep_level=5) and we have first feasible activated.

The runs are similar until it gets to the "Now displaying final relaxed solution..." message. The old version finds the solution right away using heuristics while the new version goes into iterations. So far, only this test scenario is showing the issue. -Bill

xnaegamvala · 2020-11-09T20:36:49Z

5.6.17.new.txt
5.6.17.new.txt
scenario.zip

tkralphs · 2020-11-13T04:34:54Z

Unfortunately, I can't replicate this. With the parameter file

do_primal_heuristic 1
prep_level 5
find_first_feasible 1

which I understood to be the one you're using, I got an answer back almost instantly in both cases. Solving to provable optimality take substantially longer, but the gap is nearly zero in the root node, so there doesn't seem to be any need to go further. My output looks like this.

Preprocessing finished...
 	 constraints removed: 123
Problem has 
	 1180 constraints 
	 2278 variables 
	 119277 nonzero coefficients

Total Presolve Time: 0.137313...

Solving...

granularity set at 0.000000
[New LWP 13023]
[New LWP 13024]
[New LWP 13025]
[New LWP 13026]
[New LWP 13027]
[New LWP 13028]
[New LWP 13029]
Thread 3 now processing node 0
****************************************************
* Now processing NODE 0 LEVEL 0 (from TM)
****************************************************



**** Starting iteration 1 ****

solving root lp relaxation
The LP value is: -349573521.608 [0,1296]

SH-FEAS: 0 --- -349553186.045000 
LS-FEAS: 0 --- -349553186.045000 
****** Found Better Feasible Solution !
****** After Calling Heuristics !
****** Cost: -349566986.753000

fathoming node (no more cols to check)


****************************************************
* Stopping After Finding First Feasible Solution   *
* Now displaying stats and best solution found...  *
****************************************************

My pre-solve time is substantially faster than yours, which is strange unless you are using a very old processor. I did set the prep level to 5, as you said you did, and I got the same number of constraints and variables but a different number of non-zeros. So it seems we are not solving quite the same problem. Maybe that is a clue?

You seem to be using your own driver, so I can't really be sure if there could be differences there. Could you try with the SYMPHONY executable that is automatically built and see if you have the same result? I was wondering if it might have something to do with running multi-threaded so you could also try building without OpenMP support and see if that makes a difference. If you could tell me exactly how you built SYMPHONY, that would help (what arguments to configure, etc.).

I looked over the diff between the two versions and there really isn't any change that would have affected the solve time, so there's not a lot for me to go on. I did actually find a bug related to I found a bug related to the printing of the lower bound, but it shouldn't affect anything here.

So I'm kind of at a dead end unless you can point me to some more clues or information.

BMagan01 · 2020-11-16T21:30:31Z

Hi Ted,

I am one of the developers working with xnaegamvala. Following up a bit on the above.

Does "using your own driver" just mean that we are using our own wrapper to load our own binary file ? We generate the LP file right after we load the problem from our own file so I suppose the LP file generated might be a for a slightly different problem.

Our linux build (on a centos environment) uses the coinbrew utility to build the shared objects - the wrapper executable dynamically loads these shared objects. On windows, we are just using the provided visual studio projects (VS2015) to build the various symphony libraries that are statically linked into our wrapper executable.

In my local testing of the issue on windows, I forced the number of threads to 1 (sym_int_TM_max_active_nodes=1). I could attempt to build without OpenMP support. However, I believe that by setting the processing threads to 1 parallel processing can probably be ruled out as the root cause.

From the debugging we have done so far, the bottleneck for this particular problem is happening in the ClpSimplexPrimal method. Version 5.6.17 finds infeasible solutions in there. Variables largestPrimalError_ and largestDualError_ contain very large numbers for over approximately 1,000 iterations before the method gives up and the problem is presumably solved by another method. I have observed the differing behavior in routine statusOfProblemInPrimal().

We added logging in a number of places in an attempt to spot differences in behavior between versions. The logging line 'statusOfProblemInPrimal 17 reason2" was added in CLPSimplexPrimal.cpp around line 1100. A file with the output is attached below. Hoping you can lend some insight into why this behavior might be happening. In version 5.6.14, this behavior was not observed (and reason2 never appears to be set equal to 2). If the looping is short circuited after say 10 or 15 iterations - the problem will solve almost immediately.

      if (numberThrownOut)
           reason2 = 1;
      if ((sumInfeasibility > 1.0e7 && sumInfeasibility > 100.0 * lastSumInfeasibility
                && factorization_->pivotTolerance() < 0.11) ||
                (largestPrimalError_ > 1.0e10 && largestDualError_ > 1.0e10))
           reason2 = 2;
	  printf("statusOfProblemInPrimal 17 reason2 = %i, sumInfeasibility = %f, lastSumInfeasibility = %f, factorization_->pivotTolerance() = %f, largestPrimalError_ = %f, largestDualError_ = %f\n", reason2, sumInfeasibility, lastSumInfeasibility, factorization_->pivotTolerance(), largestPrimalError_, largestDualError_);

statusOfProblemInPrimal_17.txt

tkralphs · 2020-11-23T02:11:23Z

Does "using your own driver" just mean that we are using our own wrapper to load our own binary file ? We generate the LP file right after we load the problem from our own file so I suppose the LP file generated might be a for a slightly different problem.

I just mean that you are not using SYMPHONY's own main function to read in the instance, so there could be paramters being set and other things going on that I don't have a view of. If you were using the SYMPHONY default binary, it would print out all the parameter settings and everything would be more transparent to me.

In any case, the problem seems to be with Clp, not directly with SYMPHONY, so if we can isolate the issue, we can transfer it over to Clp and perhaps get some more eyeballs on what's happening. Given that I'm not seeing the same thing as you, I suspect that maybe you are somehow using a different version of Clp. So first thing would be to verify the precise SHA you have checked out with your 5.6.17 build. The lines you quoted above are actually around line 1000 in both stable/1.17 and master, so I'm wondering if you have a different version somehow.

BMagan01 · 2020-11-23T23:53:39Z

I can attempt to dump all the symphony settings from within our own main function.

When I referenced line "1100" in my prior comment, I failed to say that I have added a number of lines of debug code that helped trace down where the problems were starting. If I ignore my debugging code, the code referenced above (if (numberThrownOut) ) actually starts at line 1006.

I did not directly use git to do the checkout. On windows, I downloaded the code from here: https://www.coin-or.org/download/source/SYMPHONY/SYMPHONY-5.6.17.zip. On linux, I used the coinbrew utility to request/download version 5.6.17. I will pull the code directly from git - release/5.6.17 (commit f917d42) and compare that to the code in our build environment.

tkralphs · 2020-11-24T14:36:48Z

I want to make sure it's clear that SYMPHONY utilizes a number of other COIN-OR projects as dependencies and you an mix and match versions of those within the same stable series. SYMPHONY 5.6.17 is a fixed version, but it can work with any Clp in the 1.17 series. What seems to come with the ZIP file above is 1.17.0, but I was testing with the latest release 1.17.6, which I think is what you should get by default if you just use coinbrew as

coinbrew fetch SYMPHONY@5.6.17

To see what you're really using, you would need to go into each dependent project and do git branch. Here's a quick bash script to do the check:

for i in *; do if [ -e $i/configure ]; then echo "Project: $i" ; cd $i; git branch; cd -; fi; done

You can post the output if you want me to check it. For at least the Windows build, it seems you are not using the latest Clp release, so the first thing to try is to upgrade that to 1.17.6. Make sure you are on 1.17.6 in Linux as well.

BMagan01 · 2020-11-24T23:48:46Z

Thanks for the response, In our linux build environment, running the above bash script returns this:

Project: Cgl

(no branch)
/home/centos/coinbrew
Project: Clp
(no branch)
/home/centos/coinbrew
Project: cmake-3.8.2
master
/home/centos/coinbrew
Project: CoinUtils
(no branch)
/home/centos/coinbrew
Project: Osi
(no branch)
/home/centos/coinbrew
Project: SYMPHONY
(no branch)
/home/centos/coinbrew

When I pulled the latest code using coinbrew, the command I used was coinbrew fetch SYMPHONY:releases/5.6.17 --no-prompt. Is it correct to assume that this will do the same thing as the above command (coinbrew fetch SYMPHONY@5.6.17). I ran coinbrew fetch SYMPHONY@5.6.17 - it did not fetch anything and the script returned the same result as above.

Running git show from the Clp folder returns this:

commit 756ddd3ed813eb1fa8b2d1b4fe813e6a4d7aa1eb
Author: Stefan Vigerske svigerske@gams.com
Date: Sat Apr 11 15:51:55 2020 +0000

creating Clp/releases/1.17.6 from Clp/stable/1.17 (rev 2700)

git show under the SYMPHONY folder returns this:

commit f917d42
Author: Ted Ralphs ted@lehigh.edu
Date: Sun Feb 24 15:00:47 2019 +0000

creating SYMPHONY/releases/5.6.17 from SYMPHONY/stable/5.6 (rev 2733)

I have not done a comparison between the windows and linux code in our builds. I will do that and also build using 1.17.6 on windows.

tkralphs · 2020-11-25T02:26:59Z

OK, so you are using 1.17.6 on Linux. Hmm, that is strange, since I can't replicate what you're seeing. What distro and compiler are you using? If we can isolate the behavior down to something replicable in stand-alone Clp, it would help. There are some debugging routines that dump the current LP relaxation to file. Ideally, you would simply dump the current LP to disk once you observe this behavior you're describing and we could see whether we can replicate the behavior outside of SYMPHONY. If it's possible, could you call Clp's writeLP() function to dump the current LP upon some trigger that indicates it is exhibiting bad behavior?

BMagan01 · 2020-11-25T20:16:20Z

Our distro and compiler are as follows:
CentOS Linux release 7.4.1708 (Core)

gcc.x86_64 4.8.5-39.el7 @base
gcc-c++.x86_64 4.8.5-39.el7 @base
gcc-gfortran.x86_64 4.8.5-39.el7 @base

I have not yet done any additional code comparisons (as mentioned above). When I do that, I will also call Clp's writeLP() at an appropriate time.

BMagan01 · 2020-11-30T16:50:52Z

The files in the attached .zip file are created in ClpSimplexPrimal.cpp inside this if block:

if ((sumInfeasibility > 1.0e7 && sumInfeasibility > 100.0 * lastSumInfeasibility
	&& factorization_->pivotTolerance() < 0.11)
	|| (largestPrimalError_ > 1.0e10 && largestDualError_ > 1.0e10)) {

Inside that block when if (largestPrimalError_ > 1.0e10 && largestDualError_ > 1.0e10), the files are generated using writeLp().

The format of the file names is largestPrimalError_YYYY-MM-DD_h_m_s_<numberIterations_>_.lp. The test was run with sym_int_TM_max_active_nodes=1. This file will generate for each of the 1,000 iterations. In the attached .zip file, files for the first 6 iterations are included. I believe they all of these files are
identical.

primal2.zip

xnaegamvala changed the title ~~Performance increase in new version primary due to items now not finding a solution in presolve~~ Takes longer to solve in new version primary due to items now not finding a solution in presolve Nov 9, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Takes longer to solve in new version primary due to items now not finding a solution in presolve #175

Takes longer to solve in new version primary due to items now not finding a solution in presolve #175

xnaegamvala commented Nov 9, 2020

xnaegamvala commented Nov 9, 2020

tkralphs commented Nov 13, 2020

BMagan01 commented Nov 16, 2020 •

edited

Loading

tkralphs commented Nov 23, 2020

BMagan01 commented Nov 23, 2020

tkralphs commented Nov 24, 2020

BMagan01 commented Nov 24, 2020

tkralphs commented Nov 25, 2020 •

edited

Loading

BMagan01 commented Nov 25, 2020

BMagan01 commented Nov 30, 2020

Takes longer to solve in new version primary due to items now not finding a solution in presolve #175

Takes longer to solve in new version primary due to items now not finding a solution in presolve #175

Comments

xnaegamvala commented Nov 9, 2020

xnaegamvala commented Nov 9, 2020

tkralphs commented Nov 13, 2020

BMagan01 commented Nov 16, 2020 • edited Loading

tkralphs commented Nov 23, 2020

BMagan01 commented Nov 23, 2020

tkralphs commented Nov 24, 2020

BMagan01 commented Nov 24, 2020

tkralphs commented Nov 25, 2020 • edited Loading

BMagan01 commented Nov 25, 2020

BMagan01 commented Nov 30, 2020

BMagan01 commented Nov 16, 2020 •

edited

Loading

tkralphs commented Nov 25, 2020 •

edited

Loading