Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bias potential matrix #11

Open
and-tos opened this issue Feb 21, 2018 · 50 comments
Open

Bias potential matrix #11

and-tos opened this issue Feb 21, 2018 · 50 comments

Comments

@and-tos
Copy link

and-tos commented Feb 21, 2018

Hi,
I wanted to make sure that I understand the definition of the bias potential matrix correctly.
Assuming an umbrella sampling experiment with 4 windows A, B, C, D and 3 states 0, 1, 2. The restraining potential defined as k*(x-x0)^2. Should the cell A0 contain the following?
x=median value of bin 0,
x0=target value of window A

@lukas-stelzl
Copy link
Contributor

lukas-stelzl commented Feb 23, 2018

Hi @andt88, thanks for opening the issue. I'm sorry for the slow response, I have been travelling :).

Yes, you're right about the bias matrix. The cell A0 should contain:

k*(x-x0)^2. Should the cell A0 contain the following?
x=median value of bin 0,
x0=target value of window A

For umbrella sampling, the easiest thing is to define the bias energy of the bin/micro state via the center/median of the bin/micro state, as you have done. See also the following paragraph from the paper on DHAM (which preceded DHAMed) https://pubs.acs.org/doi/full/10.1021/ct500719p

In US, the unbiased potential surface is modified by applying a (typically harmonic) bias potential ui(k) in each window k. This bias is evaluated at the center of the ith bin in the kth simulation, which for a harmonic potential centered at x(k) would result ..

Does you bias matrix have 3 rows and 4 columns, for your 3 states and 4 windows? Do you get a sensible free energy profile?

@and-tos
Copy link
Author

and-tos commented Feb 27, 2018

Hi Lukas, thanks for your response. My matrix is actually larger. I am trying to get this to run and provide feedback.

@and-tos
Copy link
Author

and-tos commented Feb 27, 2018

I am now running into a zero division error:
print(len(c_l)) gives me 46, which is the number of simulation windows
print(c_l[0].shape) gives me (10, 10). 10 is the number of bins I defined.

dhamed.py:20: RuntimeWarning: divide by zero encountered in log
  og = run_dhamed(c_l, -np.log(v_ar), numerical_gradients=False, jit_gradient=True)
58
loglike-start nan
nan
Traceback (most recent call last):
  File "dhamed.py", line 20, in <module>
    og = run_dhamed(c_l, -np.log(v_ar), numerical_gradients=False, jit_gradient=True)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 182, in run_dhamed
    numerical_gradients=numerical_gradients, **kwargs)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 220, in min_dhamed_bfgs
    fprime=fprime, **kwargs)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 889, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 943, in _minimize_bfgs
    gfk = myfprime(x0)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 95, in grad_dhamed_likelihood_ref_0
    grad = _loop_grad_dhamed_likelihood_0_jit(grad,g, ip, jp, ti, tj, vi, vj, nijp)
ZeroDivisionError: division by zero

@lukas-stelzl
Copy link
Contributor

Hi @andt88,
What are the units of your bias? You might not have to take the negative log of the bias array. Let me know whether that makes a difference :).

@and-tos
Copy link
Author

and-tos commented Feb 28, 2018

I am using Amber, so the unit of the force constant should be kcal/(mol*A²) and thus that of the potential kcal/mol.

@lukas-stelzl
Copy link
Contributor

Could you provide a minimum example which shows the division by zero error? I'd love to get to the bottom of this and a minimum example for debugging would help a lot!

@and-tos
Copy link
Author

and-tos commented Mar 2, 2018

Hi Lukas, the error appears when I try to calculate the uncertainty. Here's a test case:

import numpy as np

import matplotlib.pyplot as plt
from pydhamed import count_matrix
from pydhamed import run_dhamed
import count_transitions as count_transitions


num_umbrella_windows=4

bin_list=[[25.79675, 26.041050000000002], [26.04105, 26.28535], [26.285349999999998, 26.52965], [26.52965, 26.773950000000003], [26.77395, 27.018250000000002], [27.01825, 27.26255], [27.262549999999997, 27.50685], [27.506849999999996, 27.75115], [27.75115, 27.99545], [27.995449999999998, 28.23975]]

n_states=10
bias_mat=[[2.00000000e+00 3.91720050e+00 6.47352162e+00 9.66856338e+00]
 [1.14216498e+00 2.66897408e+00 4.83480608e+00 7.63935872e+00]
 [5.23059920e-01 1.65947762e+00 3.43482050e+00 5.84888402e+00]
 [1.42684820e-01 8.88711120e-01 2.27356488e+00 4.29713928e+00]
 [1.03968000e-03 3.56674580e-01 1.35103922e+00 2.98412450e+00]
 [9.81245000e-02 6.33680000e-02 6.67243520e-01 1.90983968e+00]
 [4.33939280e-01 8.79138000e-03 2.22177780e-01 1.07428482e+00]
 [1.00848402e+00 1.92944720e-01 1.58420000e-02 4.77459920e-01]
 [1.82175872e+00 6.15828020e-01 4.82361800e-02 1.19364980e-01]
 [2.87376338e+00 1.27744128e+00 3.19360320e-01 0.00000000e+00]]

states_list=[[2, 3, 2, 3, 3], [6, 6, 5, 8, 7], [8, 9, 9, 9], [9, 9, 8]]

c_l=[]

print(bin_list)
print(n_states)
print(bias_mat)
print(states_list)

for i in states_list:
    traj=np.array(i)
    c_ = count_matrix(traj,n_states=n_states)
    c_l.append(c_)
    v_ar=bias_mat.reshape(n_states,num_umbrella_windows)
    og = run_dhamed(c_l, -np.log(v_ar), numerical_gradients=False, jit_gradient=True)
    n_blocks = 3
    bl_l = []
    for bl in np.split(traj[1:],n_blocks):
        c_l = [count_matrix(bl,n_states=9)]
        bl_g = run_dhamed(c_l, -np.log(v_ar))
        bl_l.append(np.exp(-bl_g) / np.sum(np.exp(-bl_g)))
    bl_ar = np.column_stack((bl_l))
    bl_ln_ar = np.log(bl_ar)

# minmum and maximum values of the PMF (from the five reconstructions) to indicate the uncertainty.
min_pmf = np.min( bl_ln_ar - bl_ln_ar[-1], axis=1)
max_pmf = np.max( bl_ln_ar - bl_ln_ar[-1], axis=1)
bl_ar.shape
dh_log_err = np.std(-np.log(bl_ar), axis=1) / np.sqrt(n_blocks-1)
dh_log_err.shape
fig, ax = plt.subplots(figsize=(5,3))

# error bar is barely visible
#plt.errorbar(rc, rna_pmf, yerr=dh_log_err, fmt=".", capthick=1,capsize=4
rc = np.arange(0,9)
rna_pmf = og*-1
rna_pmf -= rna_pmf[-1]
plt.plot(rc, rna_pmf, ".", label="US DHAMed")

plt.fill_between(rc, min_pmf, max_pmf, alpha=0.4, label=r"min-max PMF $\Delta t=10^4 \, \mathrm{MC}$ ")

ax.set_xlabel("Distance")
ax.set_ylabel("PMF ($\mathrm{k_BT}$)")
ax.set_ylim(-1,13)
ax.set_xlim(-0.4,8.4)
ax.set_yticks([0,5,10])
ax.set_xticks([0,2,4,6,8])
ax.tick_params(direction='out')

plt.legend(borderaxespad=0.1)
plt.tight_layout()
plt.savefig("block_pmf.pdf")```

@lukas-stelzl
Copy link
Contributor

Fantastic, I'm looking into it right now :-)

@and-tos
Copy link
Author

and-tos commented Mar 2, 2018

Hey, I think the error comes from some frames lying outside of the most left and most right bin. I fixed this by changing the bin distribution. Now I get another errror:

bin_list=[[25.741194444444446, 26.096605555555556], [26.096605555555556, 26.452016666666665], [26.45201666666667, 26.80742777777778], [26.80742777777778, 27.16283888888889], [27.16283888888889, 27.51825], [27.518250000000002, 27.87366111111111], [27.87366111111111, 28.22907222222222], [28.22907222222222, 28.58448333333333], [28.584483333333335, 28.939894444444445], [28.939894444444445, 29.295305555555554]]
n_states=10
bias_mat=[[2.00000000e+00 3.91720050e+00 6.47352162e+00 9.66856338e+00]
 [8.30989671e-01 2.18024322e+00 4.16847522e+00 6.79542786e+00]
 [1.67247574e-01 9.48554163e-01 2.36869704e+00 4.42756056e+00]
 [8.77370889e-03 2.22133342e-01 1.07418710e+00 2.56496150e+00]
 [3.55568075e-01 9.80752840e-04 2.84945393e-01 1.20763067e+00]
 [1.20763067e+00 2.85096395e-01 9.71915062e-04 3.55568075e-01]
 [2.56496150e+00 1.07448027e+00 2.22266669e-01 8.77370889e-03]
 [4.42756056e+00 2.36913237e+00 9.48829654e-01 1.67247574e-01]
 [6.79542786e+00 4.16905271e+00 2.18066087e+00 8.30989671e-01]
 [9.66856338e+00 6.47424128e+00 3.91776032e+00 2.00000000e+00]]
states_list[[2, 2, 2, 2, 2], [4, 4, 4, 5, 5], [6, 6, 7, 7, 6], [6, 7, 7, 7, 6]]
bin 0 excluded
bin 1 excluded
bin 2 excluded
bin 3 excluded
bin 4 excluded
bin 5 excluded
bin 6 excluded
bin 7 excluded
bin 8 excluded
bin 9 excluded
0
loglike-start 0.0
0.0
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 0
         Function evaluations: 1
         Gradient evaluations: 1
time elapsed 1.19412994385 s
bin 0 excluded
bin 1 excluded
bin 2 excluded
bin 3 excluded
bin 4 excluded
bin 5 excluded
bin 6 excluded
bin 7 excluded
bin 8 excluded
0
loglike-start 0.0
0.0
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 0
         Function evaluations: 1
         Gradient evaluations: 1
time elapsed 0.00039005279541 s
bin 0 excluded
bin 1 excluded
bin 2 excluded
bin 3 excluded
bin 4 excluded
bin 5 excluded
bin 6 excluded
bin 7 excluded
bin 8 excluded
0
loglike-start 0.0
0.0
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 0
         Function evaluations: 1
         Gradient evaluations: 1
time elapsed 0.000334978103638 s
Traceback (most recent call last):
  File "dhamed_block.py", line 25, in <module>
    og = run_dhamed(c_l, -np.log(v_ar), numerical_gradients=False, jit_gradient=True)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 162, in run_dhamed
    n_states, n_windows)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/prepare_dhamed.py", line 211, in generate_dhamed_input
    t= state_lifetimes_counts(c_l, n_states, n_win)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/prepare_dhamed.py", line 29, in state_lifetimes_counts
    t_ar[:, iwin] = np.sum(win, axis=0)
ValueError: could not broadcast input array from shape (10) into shape (9)

@lukas-stelzl
Copy link
Contributor

lukas-stelzl commented Mar 2, 2018

Thanks for the update! Two questions

1 You are loop over your umbrella windows, right?

for i in states_list:
    traj=np.array(i)
    c_ = count_matrix(traj,n_states=n_states)
    c_l.append(c_) 

The list of the count matrices grows, while you're iterating. For umbrella sampling, I'd expect one list of count matrices, with one entry for each umbrella window, for instance, for the analysis of umbrella sampling of ion permeation through the GLIC channel, I had:

len(transition_count_l) # one transition count matrix for each of the 153 umbrella windows
153

See https://github.com/bio-phys/PyDHAMed/blob/master/pydhamed/glic-ion-channel/glic_ion_channel_permeation.ipynb for reference.

I'd thus run the optimization once for all four umbrella windows, i.e., I'm not sure whether run_dhamed should be executed within the loop over states. Do let me know if I'm missing something here.

2 Are you sure you have to the take the log of your bias array?

The RNA duplex formation example is a bit misleading in that respect. There the bias returned by the simulations was w_i^a = e^(-beta u_i^a). Thus I have to take the log to get u_i^a= - ln(w_i^\a) k_BT$. Note to myself, I need to highlight this in the RNA example. In the GLIC ion channel example, I can directly input the bias u_i^a=k^a*(x_i-x_0^a)^2.

@and-tos
Copy link
Author

and-tos commented Mar 2, 2018

Thanks for your feedback! I ran an umbrella sampling to calculate the free energy of binding for a ligand to a protein.
You're right, executing run_dhamed within the loop is wrong.
Regarding the unit of v_ar, I wasn't sure which units run_dhamed actually requires. As I mentioned earlier, my v_ar is in kcal/mol.

for i in states_list:
    traj=np.array(i)
    c_ = count_matrix(traj,n_states=n_states)
    c_l.append(c_)
v_ar=bias_mat.reshape(n_states,num_umbrella_windows)
og = run_dhamed(c_l, v_ar, numerical_gradients=False, jit_gradient=True)
n_blocks = 2
bl_l = []
for bl in np.split(traj[1:],n_blocks):
        c_l = [count_matrix(bl,n_states=n_states)]
        bl_g = run_dhamed(c_l, v_ar)
        bl_l.append(np.exp(-bl_g) / np.sum(np.exp(-bl_g)))
bl_ar = np.column_stack((bl_l))
bl_ln_ar = np.log(bl_ar)
Traceback (most recent call last):
  File "dhamed_block.py", line 25, in <module>
    og = run_dhamed(c_l, v_ar, numerical_gradients=False, jit_gradient=True)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 182, in run_dhamed
    numerical_gradients=numerical_gradients, **kwargs)
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 220, in min_dhamed_bfgs
    fprime=fprime, **kwargs)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 889, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 964, in _minimize_bfgs
    old_fval, old_old_fval, amin=1e-100, amax=1e100)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 783, in _line_search_wolfe12
    **kwargs)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/linesearch.py", line 101, in line_search_wolfe1
    c1=c1, c2=c2, amax=amax, amin=amin, xtol=xtol)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/linesearch.py", line 175, in scalar_search_wolfe1
    derphi1 = derphi(stp)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/linesearch.py", line 90, in derphi
    gval[0] = fprime(xk + s*pk, *newargs)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "/storage/home/local/andtos-loc/Dropbox_2/PhD/IFN/IFN_2018_20/DHAMed/pydhamed/optimize_dhamed.py", line 95, in grad_dhamed_likelihood_ref_0
    grad = _loop_grad_dhamed_likelihood_0_jit(grad,g, ip, jp, ti, tj, vi, vj, nijp)
ZeroDivisionError: division by zero

@lukas-stelzl
Copy link
Contributor

I think you have to divide your bias array by RT to convert from kcal/mol to kBT. N.B., assuming that your temperature was 300 K and dividing by RT, lets run_dhamed finish, but I get non-sensical values.

Do you still get the division by zero error on your full data set?

Comparison to WHAM provides a quick test, whether your DHAMed results are sensible. You'd expect that many/some features of the WHAM PMF are reproduced by the DHAMed PMF.

@and-tos
Copy link
Author

and-tos commented Mar 2, 2018

I am currently running it on the full data set. I'll let you know once it's finished.

@and-tos
Copy link
Author

and-tos commented Mar 6, 2018

Hi Lukas I am still getting the division by zero error on the full data set. Should I send you the complete input?

@lukas-stelzl
Copy link
Contributor

Yes, this would be great! I'm very keen to get to the bottom of this problem :)

@and-tos
Copy link
Author

and-tos commented Mar 6, 2018

I attached a file that contains:
bin_list
n_states
bias_mat
states_list
dhamed.txt

@and-tos
Copy link
Author

and-tos commented Mar 8, 2018

Hi Lukas, did the file work for you? Do you prefer another format?

@kain88-de
Copy link
Contributor

The file is hard to parse for us. Could you please store the arrays using a native numpy function.

np.savez('dhamed.npz', bias_matrix=bias_matrix, bin_list=bin_list)

This is much easier for us to read again into a python process. The number of states is 1000, right?

@and-tos
Copy link
Author

and-tos commented Mar 8, 2018

Here it is. I had to label it txt to upload it but the format is npz. Thanks a lot!!
dhamed.npz.txt

@lukas-stelzl
Copy link
Contributor

Thanks a lot! I can open your file and I'm trying to figure out what's going on right now.

@lukas-stelzl
Copy link
Contributor

I managed to get DHAMed to work for your data. I made a trivial mistake in my code that affects the setup of calculations when there are empty bins/states. I'm fixing this right now and I will try to update the repository later today.

@lukas-stelzl
Copy link
Contributor

d42f2a6 should fix the problem- the problem was a trivial mistake in the handling of empty bins.

Have a look at the Jupyter notebook (had to rename it to .txt), whether the result makes sense to you.

handle_excluded_states-3.ipynb.txt

@and-tos
Copy link
Author

and-tos commented Mar 13, 2018

Thank you so much for fixing this! I am running it right now and will let you know then.

@and-tos
Copy link
Author

and-tos commented Mar 14, 2018

I'm still getting the same error. I downloaded the Master Branch yesterday. Did I get the wrong version?

@kain88-de
Copy link
Contributor

Did you update your installation? Just in case you can run pip install . --user in the repository to force an update of the installation. Alternatively, a python interpreter started from the repository directly should also pick up the current checkout.

@and-tos
Copy link
Author

and-tos commented Mar 15, 2018

I'm not sure what it is I am doing wrong but I still get the error. I cloned the master branch and ran pip install . --user as you suggested, but I am still getting the zero division error. I am running this on the same npz file I sent to you.

@lukas-stelzl
Copy link
Contributor

I'll look into that :) Thanks a lot for letting me know!

@and-tos
Copy link
Author

and-tos commented Mar 15, 2018

Is there a possibility to check whether the correct version is installed and called?

@lukas-stelzl
Copy link
Contributor

lukas-stelzl commented Mar 15, 2018

You could use ?? to for instance on generate_dhamed_input() to see whether you are calling the latest version.

For generate_dhamed_input() you'd expect to see (towards the end):
` # Remove excluded counts from the total number of transitions out of state.

nk = check_total_transition_counts(n_out, n_in, paired_ar, n_actual) `
which is new.

@and-tos
Copy link
Author

and-tos commented Mar 16, 2018

I checked thoroughly that I have the correct version. Here's the error I get (the "bin exclusion" seems new to me, so I assume this is from the changes in the code):

bin 0 excluded
bin 1 excluded
bin 2 excluded
bin 3 excluded
bin 4 excluded
bin 5 excluded
bin 6 excluded
bin 7 excluded
bin 8 excluded
bin 9 excluded
bin 10 excluded
bin 11 excluded
bin 12 excluded
bin 13 excluded
bin 14 excluded
bin 15 excluded
bin 16 excluded
bin 17 excluded
bin 18 excluded
bin 19 excluded
bin 20 excluded
bin 21 excluded
bin 22 excluded
bin 23 excluded
bin 24 excluded
bin 25 excluded
bin 26 excluded
bin 27 excluded
bin 28 excluded
bin 29 excluded
bin 31 excluded
bin 34 excluded
bin 975 excluded
bin 976 excluded
bin 977 excluded
bin 979 excluded
bin 980 excluded
bin 981 excluded
bin 982 excluded
bin 983 excluded
bin 984 excluded
bin 985 excluded
bin 986 excluded
bin 987 excluded
bin 988 excluded
bin 989 excluded
bin 990 excluded
bin 991 excluded
bin 992 excluded
bin 993 excluded
bin 994 excluded
bin 995 excluded
bin 996 excluded
bin 997 excluded
bin 998 excluded
bin 999 excluded
Number of transition pairs 169579
inf
Traceback (most recent call last):
  File "dhamed.py", line 35, in <module>
    og = run_dhamed(c_l, np.divide(v_ar, 0.00198720358509), numerical_gradients=False, jit_gradient=True)
  File "/home/local/andtos-loc/.local/lib/python2.7/site-packages/pydhamed/optimize_dhamed.py", line 219, in run_dhamed
    numerical_gradients=numerical_gradients, **kwargs)
  File "/home/local/andtos-loc/.local/lib/python2.7/site-packages/pydhamed/optimize_dhamed.py", line 257, in min_dhamed_bfgs
    fprime=fprime, **kwargs)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 889, in fmin_bfgs
    res = _minimize_bfgs(f, x0, args, fprime, callback=callback, **opts)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 943, in _minimize_bfgs
    gfk = myfprime(x0)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "/home/local/andtos-loc/.local/lib/python2.7/site-packages/pydhamed/optimize_dhamed.py", line 128, in grad_dhamed_likelihood_ref_0
    grad = _loop_grad_dhamed_likelihood_0_jit(grad,g, ip, jp, ti, tj, vi, vj, nijp)
ZeroDivisionError: division by zero

@and-tos
Copy link
Author

and-tos commented Mar 16, 2018

I noticed that I get different results when running from the jupyter notebook than from the command line:
Command line gives this output:

...
Number of transition pairs 169579
inf
Traceback (most recent call last):
...

Jupyter Notebook this:

...
Number of transition pairs 169579
8851574.60815
Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 8850065.308126
         Iterations: 311
         Function evaluations: 428
         Gradient evaluations: 418
time elapsed 658.16526103 s

@lukas-stelzl
Copy link
Contributor

lukas-stelzl commented Mar 16, 2018

Thanks for the update :) To check, I also ran the notebook on my laptop installing the package again from GitHub

git pull origin
pip install --user -e .

Output

Number of transition pairs 169579
8851574.60815
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 8850065.308119
Iterations: 314
Function evaluations: 484
Gradient evaluations: 473
time elapsed 778.066173077 s

Check that I'm running the correct version.

prepare_dhamed.generate_dhamed_input??
# Remove excluded counts from the total number of transitions out of state.
nk = check_total_transition_counts(n_out, n_in, paired_ar, n_actual)

@and-tos
Copy link
Author

and-tos commented Mar 16, 2018

I realized that I get the error when dividing the bias potential matrix by RT (0.0019). Can you reproduce the error?

@lukas-stelzl
Copy link
Contributor

I'll try to have a look at this tmr morning :)

@lukas-stelzl
Copy link
Contributor

Where is the 0.0019 coming from? If your energies are in kcal/mol, I'd expect something like 1.98x10^-3 kcal K^-1 mol^-1 x 300 K ~ 0.596 kcal/mol for the factor to divide the bias energies. Do tell me what, I'm missing.

@and-tos
Copy link
Author

and-tos commented Mar 20, 2018 via email

@lukas-stelzl
Copy link
Contributor

Hi @andt88, can I close the issue or are there still problems?

@and-tos
Copy link
Author

and-tos commented Apr 8, 2018

Sorry I was on vacation until just now. It's working in principle but the PMF looks quite different from the one I get using a TI approach. I will have a closer look in the coming days and keep you posted. Thanks a lot for your help so far!!

@lukas-stelzl
Copy link
Contributor

Let me know how it goes :) One thing that would be interesting would be to compare the PMF from DHAMed to WHAM.

In our paper we write

In practical applications, DHAMed should be compared with (binless) WHAM. If the results differ significantly, this can mean that dynamics was relevant (and the DHAMed free energies improve over WHAM) or that the states are not kinetically meaningful (in which case neither DHAM nor WHAM free energies are likely to be accurate).

@lukas-stelzl lukas-stelzl reopened this Apr 10, 2018
@and-tos
Copy link
Author

and-tos commented Apr 13, 2018

Hi Lukas, thanks for reopening this!
Attached you can find the results from my TI calculation (accumulative work) and the DHAMed result from your method. Interestingly, DHAMed shows a lot of scattering at the borders, the overall shape of the curves is however quite similar, just slightly shifted along the x-axis.
compare_ti_dhamed

@lukas-stelzl
Copy link
Contributor

lukas-stelzl commented Apr 16, 2018

That looks pretty good to me, thanks for the update!

In recent (unpublished) work, I have also found scatter at the ends of DHAMed profiles. I found that all the structures outside the studied range of the reaction coordinate/order parameter to the would be mapped to the structural states/bins with the smallest and largest value of the order parameter. To fix that I added one more state on either end of the PMF, included the state in the DHAMed or WHAM calculations, but left out the state in the final plot (plotting over just the range of interest).

@and-tos
Copy link
Author

and-tos commented Apr 17, 2018

I tried following your advice, but I'm not sure what it is I am doing wrong.
My reaction coordinate goes from 26 to 49 Angstrom.
If I defined my bins from 26 to 49 Angstrom and I had a structure at 25 Angstrom, this would lead to the observed scattering, correct?

I therefore defined the bins between 21 to 54 Angstrom. As expected, each structure is now assigned to a bin. However the scattering at the edges of the plot remains unchanged.

@lukas-stelzl
Copy link
Contributor

Hm, I need to think about this in more detail. At the edge of your reaction coordinate you may have fewer events, which may be one reason for the scattering.

@and-tos
Copy link
Author

and-tos commented Apr 18, 2018

I made different plots with changing padding at the edges and the number of bins. This affects the scattering and the shift along the x-axis and the energy.
pmf_pad_n_states

@and-tos
Copy link
Author

and-tos commented Apr 19, 2018

Here's the comparison for TI vs DHAMed with pad=0 and n_states=50
compare_ti_dhamed

@lukas-stelzl
Copy link
Contributor

Thank a lot for your update! That looks very good :) It seems that the padding did not help in your case, which is interesting for the further development of DHAMed.

@and-tos
Copy link
Author

and-tos commented Apr 19, 2018

Thanks a lot to you and @kain88-de for your help!!

@and-tos
Copy link
Author

and-tos commented May 11, 2018

Hi there again :)
Before I try to implement it myself, I wanted to ask if pyDHAMed already includes a function to calculate kinetic rate constants and statistical uncertainties.
Regarding the statistical uncertainty, I already saw your blocking approach, but my understanding from the publication is that there's a way to calculate a covariance matrix from F.
I look forward to your response!

@lukas-stelzl
Copy link
Contributor

Hi, I'm sorry for the slow response. pyDHAMed does not include such functions yet, but I'm working on including them.

  1. I will have quick look at calculating the covariance matrix from F next week Estimate uncertainty #12.
  2. Calculating kinetics from e.g., umbrella sampling is tricky. I will think about a strategy that might work for you and others and what functions should be included in the package. Kinetics calculations #13.

@and-tos
Copy link
Author

and-tos commented May 24, 2018

Thanks a lot! Let me know if I can help with testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants