Initial Python Version of Genotyping Script - Not Tested Yet #609

shadizaheri
Contributor

Initial Python Version of Genotyping Script
Description:
This pull request responds to issue #403 and introduces the initial Python version of our genotyping script. The logic was previously implemented in Bash; this PR aims to move it to a more maintainable and readable Python implementation.

Please note: I have not thoroughly checked or tested this script. This submission is intended to serve as a starting point for further refinement and optimization. Feedback, suggestions, and thorough reviews are highly encouraged to ensure the quality and functionality of the code.

Additional Note:
I have divided the original Bash script into sections to facilitate the transition from the Bash script to Python. I've added the corresponding line numbers from the Bash script as comments within the Python script for reference and easier tracking. This should aid in understanding the structure and mapping the Python code back to its Bash counterpart.

Room for improvement:

  • External Command Sanitization: Please ensure that all inputs to os.system() calls are sanitized and validated. This is crucial to prevent potential command injection vulnerabilities.

  • Variable Initialization: The comments in the script mention that certain variables (like GTDIR and cleaned_output_vcf) should be defined elsewhere. Ensure these variables are initialized correctly in the relevant parts of the code.

  • Code Repetition: The current version contains repeated calls, such as prepare_sample_lists(args.FAMFILE, GTDIR) and setup_genotype_counts_header(GTDIR), which should be consolidated.

  • Code Optimization: The logic and methods in the script still need to be optimized.
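
For the command sanitization point above, one common option is to replace os.system() with subprocess.run() and an argument list, so that no shell ever parses the input. The sketch below is only an illustration under stated assumptions: run_command is a hypothetical helper, not a function from this PR, and the "untrusted" value stands in for any user-supplied path or sample name.

```python
import shlex
import subprocess
import sys


def run_command(args):
    """Hypothetical helper: run an external command without invoking a shell.

    Each element of `args` is passed to the program as one literal argument,
    so shell metacharacters such as ';' or '|' are never interpreted.
    """
    result = subprocess.run(
        [str(a) for a in args],
        check=True,            # raise CalledProcessError on a non-zero exit
        capture_output=True,   # collect stdout/stderr instead of printing
        text=True,             # decode output as str
    )
    return result.stdout


# A hostile "filename" that would run a second command under os.system().
untrusted = "sample.vcf; echo INJECTED"

# The child process simply echoes back the single argument it received.
out = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
)

# If a shell string truly cannot be avoided, shlex.quote() escapes one argument.
quoted = shlex.quote(untrusted)
```

Here the child process receives the whole string as one argument, so the "; echo INJECTED" part is never executed as a command. Validating inputs (for example, checking that a path exists and has the expected extension) is still worthwhile on top of this.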

@mwalker174
Collaborator

Thank you @shadizaheri, this was a lot of work. I'm going to close this since we have an optimized reimplementation in #614 that's in production now.

@mwalker174 closed this on Apr 29, 2024