
Commit

Merge pull request #121 from jgregoriods/feature/sentence-length
add sentence length distribution plot
meghdadFar authored Nov 17, 2023
2 parents f9a9e9d + be74bd0 commit cda571b
Showing 29 changed files with 144 additions and 685 deletions.
Binary file modified docs/.doctrees/anomalies.doctree
Binary file not shown.
Binary file modified docs/.doctrees/api.doctree
Binary file not shown.
Binary file modified docs/.doctrees/bias.doctree
Binary file not shown.
Binary file modified docs/.doctrees/clustering.doctree
Binary file not shown.
Binary file modified docs/.doctrees/codeofconduct.doctree
Binary file not shown.
Binary file modified docs/.doctrees/contributing.doctree
Binary file not shown.
Binary file modified docs/.doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/.doctrees/index.doctree
Binary file not shown.
Binary file modified docs/.doctrees/labels.doctree
Binary file not shown.
Binary file modified docs/.doctrees/mwes.doctree
Binary file not shown.
Binary file modified docs/.doctrees/start.doctree
Binary file not shown.
Binary file modified docs/.doctrees/textstats.doctree
Binary file not shown.
Binary file modified docs/.doctrees/utilities.doctree
Binary file not shown.
Binary file added docs/_images/sentencelen.png
16 changes: 13 additions & 3 deletions docs/_sources/textstats.rst.txt
@@ -52,15 +52,24 @@ when you carry out mini-batch training.

.. code:: python
ta.show_distplot(plot='doc_len')
ta.show_distplot(distribution='doc_len')
|doclen|

You can also see the distribution of sentence lengths to make better
decisions about semantic composition functions and sentence embeddings.

.. code:: python
ta.show_distplot(distribution='sentence_len')
|sentencelen|
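
To make concrete what a sentence length distribution measures, here is a minimal, self-contained sketch that uses only the standard library. It is not wordview's implementation: the `sentence_lengths` helper, the naive punctuation-based sentence splitter, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import Counter


def sentence_lengths(docs):
    """Return the whitespace-token count of every sentence in a list of documents.

    Hypothetical helper for illustration only; wordview may tokenize differently.
    """
    lengths = []
    for doc in docs:
        # Naive split: break after '.', '!' or '?' when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            tokens = sentence.split()
            if tokens:
                lengths.append(len(tokens))
    return lengths


docs = [
    "I loved this film. It was great!",
    "Terrible. Would not watch again.",
]
lengths = sentence_lengths(docs)

# Count how many sentences have each length and print a small text histogram.
histogram = Counter(lengths)
for n in sorted(histogram):
    print(f"{n:>2} tokens: {'#' * histogram[n]}")
```

A distribution with a long right tail suggests that a few very long sentences may dominate padding or truncation choices for sentence embeddings.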

Or, you can see the Zipf distribution of words:

.. code:: python
ta.show_distplot(plot='word_frequency_zipf')
ta.show_distplot(distribution='word_frequency_zipf')
|wordszipf|
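
As a rough illustration of what a Zipf plot is built from, the sketch below computes a rank-frequency table with the standard library. The `rank_frequency` helper and the toy documents are hypothetical and are not part of wordview; under Zipf's law the count is expected to fall off roughly in proportion to 1/rank on a large corpus, which a toy sample like this one cannot show reliably.

```python
from collections import Counter


def rank_frequency(docs):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration only; not wordview's API.
    """
    counts = Counter(word.lower() for doc in docs for word in doc.split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ordered, start=1)]


docs = ["the cat sat on the mat", "the dog sat"]
table = rank_frequency(docs)

for rank, word, count in table:
    # On a Zipfian corpus, rank * count stays roughly constant.
    print(rank, word, count, rank * count)
```

Plotting rank against count on log-log axes gives the familiar near-straight line when the corpus follows Zipf's law.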

@@ -89,4 +98,5 @@ use the ``show_word_clouds`` method.
.. |nouns| image:: ../figs/nouns.png
.. |adjs| image:: ../figs/adjectives.png
.. |doclen| image:: ../figs/doclen.png
.. |wordszipf| image:: ../figs/wordszipf.png
.. |wordszipf| image:: ../figs/wordszipf.png
.. |sentencelen| image:: ../figs/sentencelen.png
2 changes: 1 addition & 1 deletion docs/_static/css/theme.css

Large diffs are not rendered by default.

497 changes: 11 additions & 486 deletions docs/api.html

Large diffs are not rendered by default.

18 changes: 9 additions & 9 deletions docs/bias.html
@@ -21,8 +21,8 @@
<script src="_static/js/theme.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Anomalies &amp; Outliers" href="anomalies.html" />
<link rel="prev" title="Multiword Expressions (MWEs)" href="mwes.html" />
<link rel="next" title="Analysis of Anomalies &amp; Outliers" href="anomalies.html" />
<link rel="prev" title="Analysis &amp; Extraction of Multiword Expressions (MWEs)" href="mwes.html" />
</head>

<body class="wy-body-for-nav">
@@ -52,12 +52,12 @@
</ul>
<p class="caption" role="heading"><span class="caption-text">Exploratory Data Analysis (EDA):</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Analysis &amp; Extraction of Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Bias Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Clustering</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Analysis of Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Cluster Analysis</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Utilities:</span></p>
<ul>
@@ -211,8 +211,8 @@ <h1>Bias Analysis<a class="headerlink" href="#bias-analysis" title="Permalink to
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="mwes.html" class="btn btn-neutral float-left" title="Multiword Expressions (MWEs)" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="anomalies.html" class="btn btn-neutral float-right" title="Anomalies &amp; Outliers" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
<a href="mwes.html" class="btn btn-neutral float-left" title="Analysis &amp; Extraction of Multiword Expressions (MWEs)" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="anomalies.html" class="btn btn-neutral float-right" title="Analysis of Anomalies &amp; Outliers" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>

<hr/>
10 changes: 5 additions & 5 deletions docs/codeofconduct.html
@@ -52,12 +52,12 @@
</ul>
<p class="caption" role="heading"><span class="caption-text">Exploratory Data Analysis (EDA):</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Analysis &amp; Extraction of Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="bias.html">Bias Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Clustering</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Analysis of Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Cluster Analysis</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Utilities:</span></p>
<ul>
78 changes: 48 additions & 30 deletions docs/contributing.html
@@ -52,12 +52,12 @@
</ul>
<p class="caption" role="heading"><span class="caption-text">Exploratory Data Analysis (EDA):</span></p>
<ul>
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Stats</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="textstats.html">Text Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="labels.html">Label Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="mwes.html">Analysis &amp; Extraction of Multiword Expressions (MWEs)</a></li>
<li class="toctree-l1"><a class="reference internal" href="bias.html">Bias Analysis</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Clustering</a></li>
<li class="toctree-l1"><a class="reference internal" href="anomalies.html">Analysis of Anomalies &amp; Outliers</a></li>
<li class="toctree-l1"><a class="reference internal" href="clustering.html">Cluster Analysis</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Utilities:</span></p>
<ul>
@@ -66,8 +66,9 @@
<p class="caption" role="heading"><span class="caption-text">Contributing:</span></p>
<ul class="current">
<li class="toctree-l1 current"><a class="current reference internal" href="#">Contributing to wordview</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#getting-started">Getting Started</a></li>
<li class="toctree-l2"><a class="reference internal" href="#environment-setup">Environment Setup</a></li>
<li class="toctree-l2"><a class="reference internal" href="#start-with-a-github-issue">Start with a GitHub Issue</a></li>
<li class="toctree-l2"><a class="reference internal" href="#get-the-up-to-date-code">Get the Up to Date Code</a></li>
<li class="toctree-l2"><a class="reference internal" href="#setup-your-dev-environment">Setup your Dev Environment</a></li>
<li class="toctree-l2"><a class="reference internal" href="#testing">Testing</a></li>
<li class="toctree-l2"><a class="reference internal" href="#code-quality">Code Quality</a></li>
<li class="toctree-l2"><a class="reference internal" href="#pull-request-pr">Pull Request (PR)</a></li>
@@ -106,38 +107,55 @@

<section id="contributing-to-wordview">
<h1>Contributing to wordview<a class="headerlink" href="#contributing-to-wordview" title="Permalink to this heading"></a></h1>
<p>Writing a random text to be removed in order to test semantic versioning.</p>
<p>Thank you for contributing to wordview! We and the users of this repo
appreciate your efforts! If you spot a problem or you have a feature request
or you wanted to suggest an improvement, please create an issue. Please
first search the existing open and closed issues
<a class="reference external" href="https://github.com/meghdadFar/wordview/issues">here</a>. If a related
issue already exists, you can add your comment and avoid creating
duplicate or very similar issues. If you come across an issue that you
would like to work on, feel free to <a class="reference external" href="#pull-request-pr">open a PR</a> for
it.</p>
<section id="getting-started">
<h2>Getting Started<a class="headerlink" href="#getting-started" title="Permalink to this heading"></a></h2>
<p>To begin contributing, clone the repository and make sure you are on
<code class="docutils literal notranslate"><span class="pre">main</span></code> branch. Then create your own branch.</p>
<p>Thank you for contributing to Wordview! We and the users of this repo
appreciate your efforts! Please follow the guidelines below to start contributing to Wordview.</p>
<section id="start-with-a-github-issue">
<h2>Start with a GitHub Issue<a class="headerlink" href="#start-with-a-github-issue" title="Permalink to this heading"></a></h2>
<p>If you are developing a feature, or you spot a problem, or you want to suggest an improvement,
please first search the issues to see whether a related issue already exists.
You can search for the existing issues <a class="reference external" href="https://github.com/meghdadFar/wordview/issues">here</a>.
If a related issue already exists, you can add a comment and assign it to yourself;
otherwise, you are welcome to create a new issue.</p>
</section>
<section id="get-the-up-to-date-code">
<h2>Get the Up to Date Code<a class="headerlink" href="#get-the-up-to-date-code" title="Permalink to this heading"></a></h2>
<p>To begin contributing, clone the repository and make sure you are on <code class="docutils literal notranslate"><span class="pre">main</span></code> branch,
or if you have already cloned the repo, make sure you have the latest updates.</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Clone the repo</span>
git<span class="w"> </span>clone<span class="w"> </span>git@github.com:meghdadFar/wordview.git

<span class="c1"># If it&#39;s been a while since you cloned, make sure you have the latest updates:</span>
git<span class="w"> </span>pull

<span class="c1"># Create a new branch</span>
</pre></div>
</div>
<p>You can now start working on your issue by creating a new branch.</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span><span class="c1"># Create a new branch</span>
git<span class="w"> </span>checkout<span class="w"> </span>-b<span class="w"> </span>&lt;branch_name&gt;
</pre></div>
</div>
<p>Please try to name your branch such that the name clarifies the purpose
of your branch, to some extent. We commonly use hyphenated branch names.
For instance, if you are developing an anomaly detection functionality
based on a normal distribution, a good branch name can be
<code class="docutils literal notranslate"><span class="pre">normal-dist-anomaly-detection</span></code>.</p>
<p>Please follow these guidelines for naming your branches:</p>
<p><strong>Use Descriptive Names:</strong> Branch names should provide a clear indication of the purpose or content of the branch. A developer should be able to understand the purpose of the branch just from its name.</p>
<p><strong>Use Hyphens:</strong> Stick to hyphens (“-”) to separate words in branch names. Avoid spaces or special characters that might cause issues on different systems or in URLs.</p>
<p><strong>Use Lowercase:</strong> Use lowercase letters when naming your branch.</p>
<p><strong>Include a Prefix</strong>: Use one of the following prefixes:</p>
<ul class="simple">
<li><p>feature/ for feature branches</p></li>
<li><p>bugfix/ for bug fix branches</p></li>
<li><p>hotfix/ for critical hotfix branches</p></li>
<li><p>release/ for release branches</p></li>
<li><p>chore/ for maintenance or housekeeping tasks</p></li>
</ul>
<p>Here are some examples:</p>
<ul class="simple">
<li><p>feature/user-profile</p></li>
<li><p>bugfix/payment-gateway</p></li>
<li><p>hotfix/security-update</p></li>
<li><p>release/2.1.0</p></li>
<li><p>chore/update-dependencies</p></li>
</ul>
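
The naming guidelines above (descriptive, hyphen-separated, lowercase, prefixed) can be checked mechanically. The following is only a sketch under stated assumptions: `is_valid_branch` and the regular expression are hypothetical, are not part of the wordview repository, and additionally allow dots so that a name like `release/2.1.0` from the examples passes.

```python
import re

# Prefixes listed in the contributing guide.
PREFIXES = ("feature", "bugfix", "hotfix", "release", "chore")

# Assumption: lowercase letters and digits, separated by hyphens or dots.
BRANCH_RE = re.compile(
    r"^(?:%s)/[a-z0-9]+(?:[-.][a-z0-9]+)*$" % "|".join(PREFIXES)
)


def is_valid_branch(name):
    """Return True if the branch name follows the guidelines sketched above."""
    return BRANCH_RE.match(name) is not None


for candidate in ("feature/sentence-length", "release/2.1.0", "Feature/User_Profile"):
    print(candidate, is_valid_branch(candidate))
```

A check like this could run in a pre-push hook or CI job, though whether the project enforces the convention automatically is not stated in the source.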
</section>
<section id="environment-setup">
<h2>Environment Setup<a class="headerlink" href="#environment-setup" title="Permalink to this heading"></a></h2>
<section id="setup-your-dev-environment">
<h2>Setup your Dev Environment<a class="headerlink" href="#setup-your-dev-environment" title="Permalink to this heading"></a></h2>
<p>We use <a class="reference external" href="https://pypi.org/project/poetry/">Poetry</a> to manage
dependencies and packaging. Follow these steps to set up your dev
environment:</p>
137 changes: 1 addition & 136 deletions docs/genindex.html
@@ -97,93 +97,10 @@
<h1 id="index">Index</h1>

<div class="genindex-jumpbox">
<a href="#_"><strong>_</strong></a>
| <a href="#A"><strong>A</strong></a>
| <a href="#B"><strong>B</strong></a>
| <a href="#C"><strong>C</strong></a>
| <a href="#D"><strong>D</strong></a>
| <a href="#E"><strong>E</strong></a>
| <a href="#L"><strong>L</strong></a>
| <a href="#M"><strong>M</strong></a>
| <a href="#N"><strong>N</strong></a>
| <a href="#P"><strong>P</strong></a>
| <a href="#S"><strong>S</strong></a>
| <a href="#T"><strong>T</strong></a>
<a href="#M"><strong>M</strong></a>
| <a href="#W"><strong>W</strong></a>

</div>
<h2 id="_">_</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.anomaly.NormalDistAnomalies.__init__">__init__() (wordview.anomaly.NormalDistAnomalies method)</a>

<ul>
<li><a href="api.html#wordview.bias_analysis.bias.BiasDetector.__init__">(wordview.bias_analysis.bias.BiasDetector method)</a>
</li>
<li><a href="api.html#wordview.clustering.cluster.Cluster.__init__">(wordview.clustering.cluster.Cluster method)</a>
</li>
<li><a href="api.html#wordview.mwes.MWE.__init__">(wordview.mwes.MWE method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.LabelStatsPlots.__init__">(wordview.text_analysis.LabelStatsPlots method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots.__init__">(wordview.text_analysis.TextStatsPlots method)</a>
</li>
</ul></li>
</ul></td>
</tr></table>

<h2 id="A">A</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.anomaly.NormalDistAnomalies.anomalous_items">anomalous_items() (wordview.anomaly.NormalDistAnomalies method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="B">B</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.bias_analysis.bias.BiasDetector">BiasDetector (class in wordview.bias_analysis.bias)</a>
</li>
</ul></td>
</tr></table>

<h2 id="C">C</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.clustering.cluster.Cluster">Cluster (class in wordview.clustering.cluster)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.clustering.cluster.Cluster.cluster">cluster() (wordview.clustering.cluster.Cluster method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="D">D</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.bias_analysis.bias.BiasDetector.detect_bias">detect_bias() (wordview.bias_analysis.bias.BiasDetector method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="E">E</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.mwes.MWE.extract_mwes">extract_mwes() (wordview.mwes.MWE method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="L">L</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.text_analysis.LabelStatsPlots">LabelStatsPlots (class in wordview.text_analysis)</a>
</li>
</ul></td>
</tr></table>

<h2 id="M">M</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
@@ -195,58 +112,6 @@ <h2 id="M">M</h2>
</li>
</ul></li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.mwes.MWE">MWE (class in wordview.mwes)</a>
</li>
</ul></td>
</tr></table>

<h2 id="N">N</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.anomaly.NormalDistAnomalies">NormalDistAnomalies (class in wordview.anomaly)</a>
</li>
</ul></td>
</tr></table>

<h2 id="P">P</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.bias_analysis.bias.BiasDetector.print_bias_table">print_bias_table() (wordview.bias_analysis.bias.BiasDetector method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.mwes.MWE.print_mwe_table">print_mwe_table() (wordview.mwes.MWE method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="S">S</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.bias_analysis.bias.BiasDetector.show_bias_plot">show_bias_plot() (wordview.bias_analysis.bias.BiasDetector method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots.show_distplot">show_distplot() (wordview.text_analysis.TextStatsPlots method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots.show_insights">show_insights() (wordview.text_analysis.TextStatsPlots method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.text_analysis.LabelStatsPlots.show_label_plots">show_label_plots() (wordview.text_analysis.LabelStatsPlots method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots.show_stats">show_stats() (wordview.text_analysis.TextStatsPlots method)</a>
</li>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots.show_word_clouds">show_word_clouds() (wordview.text_analysis.TextStatsPlots method)</a>
</li>
</ul></td>
</tr></table>

<h2 id="T">T</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a href="api.html#wordview.text_analysis.TextStatsPlots">TextStatsPlots (class in wordview.text_analysis)</a>
</li>
</ul></td>
</tr></table>

<h2 id="W">W</h2>
Binary file modified docs/objects.inv
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/searchindex.js

Large diffs are not rendered by default.


0 comments on commit cda571b
