Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix corner case of binary search #41

Merged
merged 2 commits into from
Jun 3, 2024
Merged

Conversation

AL-JiongYang
Copy link
Contributor

The PR fixes the error discussed here: #40, where ApproxMC gets stuck in infinite loops.

The original issue triggered the error when the generated XORs were unfortunately always empty, and therefore, we can always find threshold solutions and increase the number of XORs up to n, aka the number of variables. However, the ApproxMC unexpectedly gets stuck in infinite loops at n-1 XORs.

The error is due to a common corner case in binary search, as shown at this line:

hashCount = (lowerFib+upperFib)/2;
When we approach the last step of binary search: lowerFib + 1 = upperFib, the integer division always returns lowerFib and discards the fraction (i.e., 0.5). As a result, lowerFib is always assigned to hashCount, and hashCount can never visit upperFib. That's why ApproxMC gets stuck at n-1 XORs when lowerFib = n-1 and upperFib = n.

However, the corner case doesn't lead to an error in other cases of lowerFib + 1 = upperFib. This is because the new value of upperFib is always guaranteed to have been visited before the assignment. See the visit:

threshold_sols[hashCount] = 0;
, and the following upperFib updates:
upperFib = hashCount;
upperFib = hashCount;
. The property always holds except for the initial value of upperFib at this line:
int64_t upperFib = total_max_xors;
, where the total_max_xors is equal to n. That's why we only encountered this error when upperFib = n.

Therefore, the fix is to add the above property back to the initial value of upperFib. We assume that when the number of XORs is n, we can only find one solution because there are at most 2^n solutions for n variables, as shown in the submitted commit.

@msoos
Copy link
Collaborator

msoos commented Sep 25, 2023

Please get the fuzzer from Anna and run it against this ApproxMC for at least 10k rounds.

Please also run 10k fuzz runs also under Valgrind.

I have a feeling this fix is wrong in more ways than one.

@AL-JiongYang
Copy link
Contributor Author

I'll do that, although I think the bug (and the fix) is very unlikely to trigger in practice. For example, given 7 variables, the probability of obtaining 7 empty XORs is 1 / (2^8)^7 = 1/2^56.

@msoos
Copy link
Collaborator

msoos commented Sep 25, 2023 via email

@kuldeepmeel
Copy link
Contributor

Now that we have the algorithm certified; I think this should be good to be accepted.

@msoos msoos merged commit bef1ac8 into meelgroup:master Jun 3, 2024
4 checks passed
@msoos
Copy link
Collaborator

msoos commented Jun 3, 2024

Merged, thanks! I'll run the fuzzer to check :)

@msoos
Copy link
Collaborator

msoos commented Jun 3, 2024

The fuzzer found that this lead to an UNSAT->SAT bug when AppMC was called with an unsat problem. Anyway, fixed!

@@ -488,6 +488,9 @@ void Counter::one_measurement_count(
int64_t num_explored = 0;
int64_t lower_fib = 0;
int64_t upper_fib = total_max_xors;
threshold_sols[total_max_xors] = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the sampling set is empty, total_max_xors can be 0, which in turn makes this set the number of solutions to 1. Fuzzer is smart! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants