Shallow defect analysis #50

kavanase · 2024-02-15T12:31:02Z

Quoting @adair-nicolson's initial overview:

Added the ability to easily use pydefect to identify shallow defects. When parsing a single defect with DefectParser.from_paths(), if the user sets load_phs_data=True, additional data (i.e. projected eigenvalues) is loaded from the vasprun.xml. Then dp.get_perturbed_host_state() returns information about the defect level, including an automated attempt at perturbed host state (PHS) identification. A plot of the single-particle levels is also generated for added verification. I added an example and some additional information to the tips and tricks section of the docs. I put the majority of the functions in the file doped.utils.phs.py to keep them out of the way and to allow easy changes if needed when pydefect/vise are updated. Starting a draft pull request so I can get feedback on whether you would want me to rearrange some stuff, and get those comments while I polish up the final bit.

Example:

[Image: Shallow defect analysis]

kavanase · 2024-02-15T12:32:31Z

And my comments from previous PR:
This looks really really nice @adair-nicolson, thanks very much for pushing forward with this! 😃
(Apologies for the delay in getting to this, wanted to push forward on getting v2.3 out first as it was so close to being done, and inevitably takes much longer than expected... 🥲)

For the PHS plot, could we use the doped/displacements mplstyle file (in doped.utils) by default, but with a user option of style_file to customise this (see DefectThermodynamics.plot()/_plot_site_displacements()/get_kumagai_correction() etc)?
- Also, could we add a legend that says blue/red is unoccupied/occupied, and that the dashed lines are the band edges? (If it was easy, could also be nice to change the blue/red dots to orange/blue to match sumo/doped CBM/VBM colours, but no worries if not)
For this part of the code, can it use the previously-loaded vaspruns (and OUTCAR if OUTCAR was previously loaded) for efficiency when parsing?

bulk_outcar_path, multiple = _get_output_files_and_check_if_multiple(
    "OUTCAR", dp.defect_entry.calculation_metadata["bulk_path"]
)
bulk_outcar_phs = get_outcar(bulk_outcar_path)
bulk_vr_phs = get_vasprun(bulk_vr_path, parse_projected_eigen=True)
defect_vr_phs = get_vasprun(defect_vr_path, parse_projected_eigen=True)

On a related point, assuming there's no reason not to (like it being very slow), maybe it would be best to define load_phs_data: Optional[bool] = None as the default: if load_phs_data is None or True, it tries to load and parse the PHS data, but if that fails, it only errors out / warns the user if load_phs_data is True (i.e. if the user explicitly set it)? It would be useful to have this data already parsed from the one parsing run if possible, as users usually don't know beforehand whether a certain state is likely to be a shallow state until they see the TLD plot. This can be implemented with a try/except block.
- Related point: Can we also add load_phs_data as an option to DefectsParser that is passed on to DefectParser.from_paths() when parsing multiple defects?
For the test (thanks for adding!), can we add a plotting comparison test too? See the custom_mpl_image_compare tests in the current tests for this; you just get the test function to return the figure object, then you can generate the test plot with pytest -k "phs_Cu2SiSe3" --mpl-generate-path="baseline" test_analysis.py, and test with pytest -k "phs_Cu2SiSe3" --mpl test_analysis.py; see https://github.com/matplotlib/pytest-mpl

~~Lastly, it'll be useful to have this draft PR as a SMTG/doped branch rather than a local branch, just to make it easier to dig into the code etc – I'll see if I can do this now!~~

kavanase · 2024-02-15T16:16:58Z

Does this code also allow IPR analysis @adair-nicolson? (Can't remember what 'P ratio' is here)

adair-nicolson · 2024-02-27T09:58:21Z

> Does this code also allow IPR analysis @adair-nicolson? (Can't remember what 'P ratio' is here)

So P-ratio is a similar-ish metric. The P-ratio (participation ratio) is the ratio of the projected orbital weight on the atoms neighbouring the defect site to the summed weight over all atoms. So for delocalised/band states you would expect a small value. But there was a little bug related to setting the number of nearest neighbours, which is the last thing I need to fix today. I'll also add an explainer to the tips and tricks section on PHS analysis.
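
As a rough illustration of the P-ratio described above, here is a minimal sketch on a toy 8-atom cell. The function name, the projection values and the choice of neighbour indices are all assumptions for illustration; this is not the pydefect/doped implementation:

```python
from typing import Sequence


def participation_ratio(
    site_projections: Sequence[float],
    neighbour_indices: Sequence[int],
) -> float:
    """Fraction of a state's projected weight sitting on the defect's neighbours."""
    total = sum(site_projections)
    if total == 0:
        raise ValueError("state has no projected weight on any atom")
    neighbour_weight = sum(site_projections[i] for i in neighbour_indices)
    return neighbour_weight / total


# Toy 8-atom supercell; atoms 0 and 1 are taken to be the defect's neighbours.
localised_state = [0.40, 0.35, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03]
band_like_state = [0.125] * 8  # weight spread evenly over all atoms

print(participation_ratio(localised_state, [0, 1]))  # large: most weight near the defect
print(participation_ratio(band_like_state, [0, 1]))  # small: delocalised, band-like state
```

The number of neighbours included changes the value directly, which is why the nearest-neighbour setting matters for this metric.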

adair-nicolson · 2024-02-27T21:04:58Z

@kavanase I'm going to set load_phs_data: Optional[bool] = False just to avoid having to redo all of the test cases in test_analysis.py. But can work slowly on updating them all afterwards, and then change the default behaviour in a future update

kavanase · 2024-02-28T02:29:32Z

Ok! Which part is it breaking in the analysis tests? (Shouldn't affect other properties being tested?)
I can update/re-parse any test data that might be needed

adair-nicolson · 2024-02-28T19:40:56Z

Have got round the failed tests by setting the code to skip the shallow defect analysis if you don't have the right files/vise version, returning a warning rather than raising an error. E.g.

import warnings
from importlib.metadata import version

v_vise = version("vise")
if v_vise <= "0.8.1":
    warnings.warn(
        f"You have version {v_vise} of the package `vise`,"
        f" which does not allow the parsing of non-collinear calculations."
        f" You can install the updated version of `vise` from the GitHub repo for this"
        f" functionality. Attempting to load the PHS data has been automatically skipped"
    )

so should be able to re-enable the shallow defect analysis as the default behaviour
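
One thing worth noting about a check of this shape: comparing version strings directly is lexicographic, so a hypothetical future release such as "0.10.0" would sort before "0.8.1". A minimal sketch of a numeric comparison, assuming plain X.Y.Z release strings (the helper names are made up for illustration; `packaging.version.Version` is the more general tool when pre-release or post-release tags are involved):

```python
def version_tuple(version_string: str) -> tuple:
    """Turn a plain 'X.Y.Z' release string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version_string.split("."))


def vise_too_old(installed: str, last_unsupported: str = "0.8.1") -> bool:
    """True if `installed` is at or below the last version lacking the needed parsing."""
    return version_tuple(installed) <= version_tuple(last_unsupported)


# String comparison goes character by character, so '1' < '8' decides the result:
print("0.10.0" <= "0.8.1")     # True, although 0.10.0 is the newer release
print(vise_too_old("0.10.0"))  # False: (0, 10, 0) > (0, 8, 1)
```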

kavanase · 2024-03-01T21:58:13Z

Looking nice @adair-nicolson! When this gets to the near-completion stage, could you add a short section to the advanced analysis tutorial showing it in action? Thanks!

kavanase · 2024-03-28T21:30:20Z

Btw @adair-nicolson, I updated orbital_diff to be normalised in those changes as this seems like the most robust way of doing the orbital comparisons (so either way the orb_diff in those JSON files needs to be updated), but maybe there's a reason not to do this?

kavanase · 2024-03-28T21:41:49Z

Ok once that orb_diff issue is sorted and tests pass, then this is ready to merge!

adair-nicolson · 2024-03-29T10:41:03Z

@kavanase I'm still pretty sure it's due to the difference in significant figures. Although it's only the third decimal place, that's for each atom, so if you add up all the projected eigenvalues for a single band, that can make a sizeable difference, especially in a very large supercell.

If you replace the projected eigenvalues loaded with Vasprun with those loaded with Procar, you get the same result, showing it's not related to the difference in how band edge states are calculated depending on the files you have saved, but to the initial input data.
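
To put a rough number on this, here is a small sketch of how per-atom rounding can accumulate when projections are summed over a band. The projection values are random toy data and the function names are hypothetical; only the worst-case bound (half a unit in the last kept decimal place, per atom) is general:

```python
import random


def worst_case_bound(n_atoms: int, decimals: int) -> float:
    """Largest possible error in a sum of n_atoms values each rounded to `decimals` places."""
    return n_atoms * 0.5 * 10 ** (-decimals)


def summed_rounding_error(n_atoms: int, decimals: int, seed: int = 0) -> float:
    """Absolute error in a sum of toy per-atom projections after rounding each term."""
    rng = random.Random(seed)
    projections = [rng.uniform(0.0, 0.01) for _ in range(n_atoms)]
    exact = sum(projections)
    rounded = sum(round(p, decimals) for p in projections)
    return abs(rounded - exact)


for n_atoms in (50, 500):
    for decimals in (3, 4):
        print(
            n_atoms,
            decimals,
            summed_rounding_error(n_atoms, decimals),
            worst_case_bound(n_atoms, decimals),
        )
```

The bound grows linearly with the number of atoms and shrinks tenfold for each extra decimal place, which is consistent with the effect being most visible in large supercells.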

kavanase · 2024-03-29T13:52:41Z

Ok! Can you update the reference JSONs for this then?
Before, it was passing for the PROCAR parsing but not the vaspruns. If this is the issue, is there some easy way of fixing it?

kavanase · 2024-03-30T17:35:52Z

@adair-nicolson ok so I dug in and saw the significant figures issue you were talking about (I didn't really get the screenshot at first, as it seemed to just show that replacing the Vasprun data with the PROCAR data gives the same result as the PROCARs, but I got your point from looking closer); I forgot the PROCAR is only 3 decimal places, which isn't much in this context and can make a significant difference as you say!

So I added some updates there, just to make the vasprun parsing as efficient as possible (now about 30% quicker overall), and to update the default behaviour to parse the projected eigenvalue data from the vaspruns preferentially rather than PROCARs, as it's more accurate with the 4 decimal places, more convenient for the user, and from testing over all test_analysis.py cases, seems to come out only ~5% slower than PROCAR parsing now. Also updated the eigenvalue test to use the one JSON file for vasprun/PROCAR parsed data and allow for small numerical differences, rather than using fixed individual JSON reference files. Seems to all be passing locally for me now, so will merge this once the tests above pass! 😃
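
A minimal sketch of what a single-reference, tolerance-based comparison could look like, so that vasprun-parsed and PROCAR-parsed data can be checked against the same JSON. The `nested_close` helper, the field names and the `abs_tol` value are assumptions for illustration, not the actual test code:

```python
import math


def nested_close(a, b, abs_tol: float = 1e-3) -> bool:
    """Recursively compare JSON-like data, allowing small float differences."""
    if isinstance(a, bool) or isinstance(b, bool):
        return a == b  # bools are ints in Python; compare them exactly
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            nested_close(a[key], b[key], abs_tol) for key in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            nested_close(x, y, abs_tol) for x, y in zip(a, b)
        )
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return math.isclose(a, b, abs_tol=abs_tol)
    return a == b


# Made-up reference data (4 decimal places) and a 3-decimal-place counterpart
reference = {"band_edge": {"energy": 1.2345, "p_ratio": 0.0321}, "spin": [1, 2]}
procar_like = {"band_edge": {"energy": 1.235, "p_ratio": 0.032}, "spin": [1, 2]}

print(nested_close(reference, procar_like))                # passes with the loose tolerance
print(nested_close(reference, procar_like, abs_tol=1e-5))  # fails with a tight tolerance
```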

kavanase · 2024-03-30T22:47:42Z

All tests passing, merging now. Thanks for all your work with this @adair-nicolson! 💪
Will post a Slack PSA about this during the week too, so ppl know this is now available

adair-nicolson and others added 11 commits February 1, 2024 20:04

added phs identification

2eca6e6

Merge remote-tracking branch 'origin/develop' into develop

e0584a8

added to tips and tricks

7a6c2c0

added test

10d6679

uncommented warning

611c246

fix circular import

f6da270

fix invalid escape sequence

189c741

fix typo

058b0b7

add reference to pydefect and vise

51d553d

Merge branch 'shallow_defects' into develop

a6af4d2

Merge pull request #47 from adair-nicolson/develop

13712e8

Shallow defect analysis

kavanase added the enhancement New feature or request label Feb 15, 2024

kavanase assigned adair-nicolson Feb 15, 2024

kavanase marked this pull request as draft February 15, 2024 12:31

adair-nicolson added 8 commits March 1, 2024 16:13

Merge branch 'main' into shallow_defects

c502e3b

# Conflicts: # doped/analysis.py

Updated plotting, fixed p-ratio bug, updated vise check

4316232

New load data with additional warnings

7d2c69e

Add load_phs_data to DefectsParser

799c5d0

Improved efficiency load data

02c9819

Fix bes test

7f68f49

Fix missing Gamma glyph when plotting

00ac201

Supress warning missing Gamma glyph when plotting

48a5e38

Update to tips and tricks

0b00716

kavanase added 9 commits March 28, 2024 13:16

Fix minor issue in displacements.py (wasn't accounting for disp dict reordering)

0ac71aa

Add from_dict method for DefectEntry to avoid unnecessary `pydefect` `INFO` messages

a55f924

gzip PROCARs and update tests now new easyunfold released

026d371

Update default orb and energy similarity criteria from testing, with dynamic handling and warning if needs be

c0c68ca

Minor docstrings updates, and use absolute paths in bulk_path/`defect_path` so later loading of data works even if notebook moved

90979e2

Add orb/energy criterion customisation tests

707b837

Update tips page

53ca0fe

Update advanced tutorial

97adc43

Update plotting customisation tutorial

484235b

Update changelog and version number prior to release

a55859b

Minor paper updates

b3feb76

kavanase and others added 9 commits March 29, 2024 16:24

Minor updates

1d4e725

Analysis tests cleanup

5b94405

Make vasprun projected eigenvalue parsing marginally more efficient, and avoid logging info messages

0572c09

Update default parsing behaviour; parse from vaspruns preferentially as speed mostly the same as PROCAR parsing now, and more accurate/easier

b531125

Update tests (mostly there...)

fff7eaf

Updated test for some file combinations.

eaa8478

Revert "Updated test for some file combinations."

9ab2b03

This reverts commit eaa8478.

Make vasprun parsing a bit more efficient, avoiding unnecessary parsing of DOS

54c3da4

Update eigenvalue tests, use same vasprun.xml/PROCAR JSONs but account for small numerical differences

9ffb1bc

kavanase added 2 commits March 30, 2024 18:07

Reduce testing redundancy

561779e

Minor TODO and CHANGELOG updates

8b78fde

kavanase merged commit 0c72649 into develop Mar 30, 2024
0 of 8 checks passed

kavanase deleted the shallow_defects branch March 30, 2024 22:47
