Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error calculating dG in regions < 100. Potential Solution #217

Open
ghost opened this issue Apr 14, 2024 · 0 comments
Open

Error calculating dG in regions < 100. Potential Solution #217

ghost opened this issue Apr 14, 2024 · 0 comments

Comments

@ghost
Copy link

ghost commented Apr 14, 2024

I have encountered an error when attempting to genotype smaller regions (<100bp) obtained from another source which causes the program to stop.
The error in question is as follows.

  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/bin/cnvpytor", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/lib/python3.11/site-packages/cnvpytor/__main__.py", line 242, in main
    view.genotype_prompt(list(map(binsize_type, args.genotype)), all=args.all)
  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/lib/python3.11/site-packages/cnvpytor/viewer.py", line 4290, in genotype_prompt
    self.genotype_all(bin_sizes, [line], interactive=True)
  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/lib/python3.11/site-packages/cnvpytor/viewer.py", line 4153, in genotype_all
    dG = np.min(distN[sbin1:sbin2])
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 2953, in min
    return _wrapreduction(a, np.minimum, 'min', axis, None, out,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scale_wlg_nobackup/filesets/nobackup/nesi00187/tonug0/cnv_detection/CNVpytor/venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 88, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: zero-size array to reduction operation minimum which has no identity

I have determined that this error occurs when the start and end of the provided region are within the same 100bp (e.g. 120-178 but not 180-257). This is caused sbin1 and sbin2 (viewer.py line 4124) resulting in the same number, thus dG = np.min(distN[sbin1:sbin2]) (viewer.py line 4150) returns an empty list for the np.min function. A potential solution for this is to check whether sbin1==sbin2, if so get distN at that position.

Potential change (viewer.py line 4148):

                    pN = 1 - pN / (pos2 - pos1 + 1)
                    if (sbin1 == sbin2):
                        dG = distN[sbin1]
                    else:
                        dG = np.min(distN[sbin1:sbin2])

However, this made me wonder about the case in which the region only intersects 2 bins (e.g. 180-257 so that sbin2 == sbin1 +1). In this case, distN[sbin1:sbin2] would only return the distance for sbin1 which doesn't sound correct as we would want to know the minimum distance out of sbin1 and sbin2. Therefore, would it be more suitable to add 1 to sbin2 so that the last bin is always included in the minimum distance check? This would also fix the issue for regions within the same 100bp without needed to check sbin1 == sbin2.

Potential change (viewer.py line 4148):

                    pN = 1 - pN / (pos2 - pos1 + 1)
                    dG = np.min(distN[sbin1:sbin2+1])

What would be the best solution to deal with these regions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

0 participants