Commit

Merge pull request #28 from perib/dev
Dev
perib authored Jul 18, 2023
2 parents f98665f + a58f6fe commit 75e7326
Showing 4 changed files with 8 additions and 8 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -7,7 +7,7 @@
TPOT stands for Tree-based Pipeline Optimization Tool. TPOT2 is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. Consider TPOT2 your Data Science Assistant.

TPOT2 is a rewrite of TPOT with some additional functionality. Notably, we added support for graph-based pipelines and additional parameters to better specify the desired search space.
-TPOT2 is currently in Alpha. This means that there will likely be some backwards incompatible changes to the API as we develop. Some implemented features may be buggy. There is a list of known issues written at the bottom of this README. Some features have placeholder names or are listed as "Experimental" in the doc string. These are features that may not be fully implemented and may or may work with all other features.
+TPOT2 is currently in Alpha. This means that there will likely be some backwards incompatible changes to the API as we develop. Some implemented features may be buggy. There is a list of known issues written at the bottom of this README. Some features have placeholder names or are listed as "Experimental" in the doc string. These are features that may not be fully implemented and may or may not work with all other features.

If you are interested in using the current stable release of TPOT, you can do that here: [https://github.com/EpistasisLab/tpot/](https://github.com/EpistasisLab/tpot/).

@@ -136,7 +136,7 @@ Setting `verbose` to 5 can be helpful during debugging as it will print out the

## Contributing to TPOT2

-We welcome you to check the existing issues for bugs or enhancements to work on. If you have an idea for an extension to TPOT, please file a new issue so we can discuss it.
+We welcome you to check the existing issues for bugs or enhancements to work on. If you have an idea for an extension to TPOT2, please file a new issue so we can discuss it.


### Known issues
8 changes: 4 additions & 4 deletions Tutorial/6_SH_and_early_termination.ipynb
@@ -73,8 +73,8 @@
"fig, ax1 = plt.subplots()\n",
"ax2 = ax1.twinx()\n",
"\n",
-"interpolated_values_population = tpot2.beta_interpolation(start=initial_population_size, end=population_size, n=generations_until_end_population, n_steps=stepwise_steps, scale=population_scaling)\n",
-"interpolated_values_budget = tpot2.beta_interpolation(start=budget_range[0], end=budget_range[1], n=generations_until_end_budget, n_steps=stepwise_steps, scale=budget_scaling)\n",
+"interpolated_values_population = tpot2.utils.beta_interpolation(start=initial_population_size, end=population_size, n=generations_until_end_population, n_steps=stepwise_steps, scale=population_scaling)\n",
+"interpolated_values_budget = tpot2.utils.beta_interpolation(start=budget_range[0], end=budget_range[1], n=generations_until_end_budget, n_steps=stepwise_steps, scale=budget_scaling)\n",
"ax1.step(list(range(len(interpolated_values_population))), interpolated_values_population, label=f\"population size\")\n",
"ax2.step(list(range(len(interpolated_values_budget))), interpolated_values_budget, label=f\"budget\", color='r')\n",
"ax1.set_xlabel(\"generation\")\n",
@@ -274,7 +274,7 @@
"#Population and budget use stepwise\n",
"fig, ax1 = plt.subplots()\n",
"\n",
-"interpolated_values = tpot2.beta_interpolation(start=threshold_evaluation_early_stop[0], end=threshold_evaluation_early_stop[-1], n=cv, n_steps=cv, scale=threshold_evaluation_scaling)\n",
+"interpolated_values = tpot2.utils.beta_interpolation(start=threshold_evaluation_early_stop[0], end=threshold_evaluation_early_stop[-1], n=cv, n_steps=cv, scale=threshold_evaluation_scaling)\n",
"ax1.step(list(range(len(interpolated_values))), interpolated_values, label=f\"threshold\")\n",
"ax1.set_xlabel(\"fold\")\n",
"ax1.set_ylabel(\"percentile\")\n",
@@ -347,7 +347,7 @@
"#Population and budget use stepwise\n",
"fig, ax1 = plt.subplots()\n",
"\n",
-"interpolated_values = tpot2.beta_interpolation(start=selection_evaluation_early_stop[0], end=selection_evaluation_early_stop[-1], n=cv, n_steps=cv, scale=selection_evaluation_scaling)\n",
+"interpolated_values = tpot2.utils.beta_interpolation(start=selection_evaluation_early_stop[0], end=selection_evaluation_early_stop[-1], n=cv, n_steps=cv, scale=selection_evaluation_scaling)\n",
"ax1.step(list(range(len(interpolated_values))), interpolated_values, label=f\"threshold\")\n",
"ax1.set_xlabel(\"fold\")\n",
"ax1.set_ylabel(\"percent to select\")\n",
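Reviewer note: the notebook cells above plot a stepped schedule that moves from `start` to `end` over `n` points using at most `n_steps` distinct levels. The sketch below is a rough illustration of that shape only. It uses plain linear spacing as a stand-in and is not the beta-based curve that `tpot2.utils.beta_interpolation` computes; the function name `stepwise_linear_schedule` is hypothetical, and the `scale` parameter is deliberately left out.

```python
def stepwise_linear_schedule(start, end, n, n_steps):
    """Return n values that climb from start to end in at most n_steps plateaus.

    Hypothetical linear stand-in for illustration; not TPOT2's implementation.
    """
    # Never use more distinct levels than there are points to fill.
    n_steps = max(1, min(n_steps, n))

    # The distinct plateau levels, evenly spaced between start and end.
    if n_steps == 1:
        levels = [float(end)]
    else:
        levels = [start + (end - start) * i / (n_steps - 1) for i in range(n_steps)]

    if n == 1:
        return [levels[-1]]

    # Map each of the n points (e.g. generations) onto the nearest plateau.
    return [levels[round(g * (n_steps - 1) / (n - 1))] for g in range(n)]


if __name__ == "__main__":
    # Ten generations, four plateaus, population growing from 10 to 100.
    print(stepwise_linear_schedule(start=10, end=100, n=10, n_steps=4))
```

The resulting list can be passed to `ax1.step(...)` in the same way as `interpolated_values_population` in the notebook cell.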
2 changes: 1 addition & 1 deletion tpot2/evolvers/base_evolver.py
@@ -123,7 +123,7 @@ def __init__( self,
A dask client to use for parallelization. If not None, this will override the n_jobs and memory_limit parameters. If None, will create a new client with num_workers=n_jobs and memory_limit=memory_limit.
survival_percentage : float, default=1
Percentage of the population size to utilize for mutation and crossover at the beginning of the generation. The rest are discarded. Individuals are selected with the selector passed into survival_selector. The value of this parameter must be between 0 and 1, inclusive.
-For example, if the population size is 100 and the survival percentage is .5, 50 individuals will be selected with NSGA2 from the existing population. These will be used for mutation and crossover to generate the next 100 individuals for the next generation. The remainder are discarded from the live population. In the next generation, there will now be the 50 parents + the 100 individuals for a total of 150. Surivival percentage is based of the population size parameter and not the existing population size. Therefore, in the next generation we will still select 50 individuals from the currently existing 150.
+For example, if the population size is 100 and the survival percentage is .5, 50 individuals will be selected with NSGA2 from the existing population. These will be used for mutation and crossover to generate the next 100 individuals for the next generation. The remainder are discarded from the live population. In the next generation, there will now be the 50 parents + the 100 individuals for a total of 150. Surivival percentage is based of the population size parameter and not the existing population size (current population size when using successive halving). Therefore, in the next generation we will still select 50 individuals from the currently existing 150.
crossover_probability : float, default=.2
Probability of generating a new individual by crossover between two individuals.
mutate_probability : float, default=.7
2 changes: 1 addition & 1 deletion tpot2/tpot_estimator/estimator.py
@@ -319,7 +319,7 @@ def __init__(self, scorers,
survival_percentage : float, default=1
Percentage of the population size to utilize for mutation and crossover at the beginning of the generation. The rest are discarded. Individuals are selected with the selector passed into survival_selector. The value of this parameter must be between 0 and 1, inclusive.
-For example, if the population size is 100 and the survival percentage is .5, 50 individuals will be selected with NSGA2 from the existing population. These will be used for mutation and crossover to generate the next 100 individuals for the next generation. The remainder are discarded from the live population. In the next generation, there will now be the 50 parents + the 100 individuals for a total of 150. Surivival percentage is based of the population size parameter and not the existing population size. Therefore, in the next generation we will still select 50 individuals from the currently existing 150.
+For example, if the population size is 100 and the survival percentage is .5, 50 individuals will be selected with NSGA2 from the existing population. These will be used for mutation and crossover to generate the next 100 individuals for the next generation. The remainder are discarded from the live population. In the next generation, there will now be the 50 parents + the 100 individuals for a total of 150. Surivival percentage is based of the population size parameter and not the existing population size (current population size when using successive halving). Therefore, in the next generation we will still select 50 individuals from the currently existing 150.
crossover_probability : float, default=.2
Probability of generating a new individual by crossover between two individuals.
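Reviewer note: the `survival_percentage` docstring describes population arithmetic that is easy to misread, so here is a small sketch of the bookkeeping it states. It only counts individuals and performs no selection, mutation, or crossover. The function names and the truncation to an integer are assumptions for illustration and are not taken from TPOT2's code.

```python
def survivors_to_select(population_size, survival_percentage):
    """Number of individuals kept, based on the population size parameter.

    Hypothetical helper; int() truncation is an assumption of this sketch.
    """
    if not 0 <= survival_percentage <= 1:
        raise ValueError("survival_percentage must be between 0 and 1, inclusive")
    return int(population_size * survival_percentage)


def simulate_population_counts(population_size=100, survival_percentage=0.5, generations=3):
    """Track (survivors, offspring, live population) for each generation."""
    live = population_size  # size of the initial population
    history = []
    for _ in range(generations):
        # The survivor count depends on the population_size parameter,
        # not on how many individuals are currently alive.
        n_survivors = min(live, survivors_to_select(population_size, survival_percentage))
        # Survivors produce a full population_size batch of new individuals.
        n_offspring = population_size
        # Parents stay in the live population alongside their offspring.
        live = n_survivors + n_offspring
        history.append((n_survivors, n_offspring, live))
    return history


if __name__ == "__main__":
    for gen, counts in enumerate(simulate_population_counts()):
        print(gen, counts)
```

With the docstring's numbers (population size 100, survival percentage .5), every generation keeps 50 survivors and ends with 150 live individuals, matching the "50 parents + the 100 individuals" example.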