Merge pull request #58 from nickduran/update-newversion-branch
Update newversion branch
nickduran authored Jul 5, 2022
2 parents 90f3e08 + a7b1963 commit 3ef4a19
Showing 112 changed files with 18,760 additions and 481 deletions.
5 changes: 3 additions & 2 deletions .gitignore
@@ -8,7 +8,8 @@ dist
*.ipynb_checkpoints
*.pyc
MANIFEST
doc/_build
doc/modules/generated
docs/
*.DS_Store
build/
sandbox/
venv/
38 changes: 32 additions & 6 deletions MANIFEST.in
@@ -1,6 +1,32 @@
recursive-include *.py
recursive-include *.txt
recursive-include align *.md
graft doc
graft doc/_static
graft doc/_templates
# include src/align/data/*.txt
# include src/align/data/*.md
# recursive-include *.py
# recursive-include *.txt
# recursive-include align *.md
# graft doc
# graft doc/_static
# graft doc/_templates


# added by check-manifest
# recursive-include doc *.bat
# recursive-include doc *.buildinfo
# recursive-include doc *.css
# recursive-include doc *.doctree
# recursive-include doc *.eot
# recursive-include doc *.html
# recursive-include doc *.inv
# recursive-include doc *.js
# recursive-include doc *.pickle
# recursive-include doc *.png
# recursive-include doc *.py
# recursive-include doc *.rst
# recursive-include doc *.svg
# recursive-include doc *.ttf
# recursive-include doc *.txt
# recursive-include doc *.woff
# recursive-include doc *.woff2
# recursive-include doc Makefile
# recursive-include examples *.ipynb
recursive-include src *.md
recursive-include src *.txt
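
The two rules left active in the new `MANIFEST.in` use `recursive-include`, which selects files below a directory whose names match a glob pattern. The sketch below approximates that matching with `os.walk` and `fnmatch`; it is not setuptools' own implementation, and the helper name `recursive_include` is hypothetical.

```python
import fnmatch
import os


def recursive_include(root, directory, patterns):
    """Approximate `recursive-include <directory> <patterns...>`.

    Returns the sorted paths (relative to `root`, with forward slashes) of
    every file below `directory` whose file name matches any glob pattern.
    """
    matches = []
    start = os.path.join(root, directory)
    for current, _dirs, files in os.walk(start):
        for name in files:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                full = os.path.join(current, name)
                relative = os.path.relpath(full, root)
                matches.append(relative.replace(os.sep, "/"))
    return sorted(matches)
```

Calling `recursive_include(".", "src", ["*.md", "*.txt"])` from the repository root would list roughly the files that the two active rules are meant to ship.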
117 changes: 62 additions & 55 deletions README.md
@@ -1,49 +1,53 @@
# ALIGN, a computational tool for multi-level language analysis (optimized for Python 3)
# ALIGN, a computational tool for multi-level language analysis (optimized for Python 3.10)

`align` is a Python library for extracting quantitative, reproducible
metrics of multi-level alignment between two speakers in naturalistic
language corpora. The method was introduced in "ALIGN: Analyzing
Linguistic Interactions with Generalizable techNiques" (Duran, Paxton, &
Fusaroli, 2019; Psychological Methods).

<!--
## Try out `align` with Binder
Interested in seeing how `align` works, but not sure if you want to install it
yet? Try it out through Binder. Click the "launch" button to get a complete
cloud environment to try out the ALIGN pipeline on our Python tutorials (the CHILDES
tutorial is currently the only one fully operational). The process for Binder to launch may
take several minutes.
take several minutes.
[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/nickduran/align-linguistic-alignment/master)
-->

## Installation

`align` may downloaded directly using `pip`.
`align` may be downloaded directly using `pip`.

To download the stable version released on PyPI:

```
pip install align
```

To download directly from our GitHub repo:

```
pip install git+https://github.com/nickduran/align-linguistic-alignment.git
```

## Additional tools required for some `align` options

The Google News pre-trained word2vec vectors (`GoogleNews-vectors-negative300.bin`)
and the Stanford part-of-speech tagger (`stanford-postagger-full-2018-10-16`)
and the Stanford part-of-speech tagger (`stanford-postagger-full-2020-11-17`)
are required for some optional `align` parameters but must be downloaded
separately.

* Google News: https://code.google.com/archive/p/word2vec/ (page) or
https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing
(direct download)
- Google News: https://code.google.com/archive/p/word2vec/ (page) or
https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing
(direct download)

* Stanford POS tagger: https://nlp.stanford.edu/software/tagger.shtml#Download (page)
or https://nlp.stanford.edu/software/stanford-postagger-full-2018-10-16.zip
(direct download)
- Stanford POS tagger: https://nlp.stanford.edu/software/tagger.shtml#Download (page)
or https://nlp.stanford.edu/software/stanford-tagger-4.2.0.zip
(direct download)
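
Because both resources are downloaded separately, it can help to confirm they are in place before enabling the options that need them. This is a minimal sketch that assumes both were unpacked into a single directory under the names given in the updated README; that layout and the helper name `missing_resources` are assumptions, not part of `align`.

```python
from pathlib import Path

# Names as given in the updated README; where they live on disk is an assumption.
REQUIRED = (
    "GoogleNews-vectors-negative300.bin",
    "stanford-postagger-full-2020-11-17",
)


def missing_resources(base_dir, required=REQUIRED):
    """Return the required names that do not exist directly under base_dir."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).exists()]
```

An empty list means both the vectors file and the tagger directory were found under `base_dir`.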

## Tutorials

@@ -57,26 +61,27 @@ that helps streamline workflows. A major advantage is that Anaconda also makes i
to set up unique Python environments - which may be necessary to run `align`
and the tutorials given `align` is currently optimized for Python 3.

* **Jupyter Notebook 1: CHILDES**
* This tutorial walks users through an analysis of conversations from a
single English corpus from the CHILDES database (MacWhinney,
2000)---specifically, Kuczaj’s Abe corpus (Kuczaj, 1976). We analyze the
last 20 conversations in the corpus in order to explore how ALIGN can be
used to track multi-level linguistic alignment between a parent and child
over time, which may be of interest to developmental language researchers.
Specifically, we explore how alignment between a parent and a child
changes over a brief span of developmental trajectory.

* **Jupyter Notebook 2: Devil's Advocate**
* This tutorial walks users throught the analysis reported in (Duran,
Paxton, & Fusaroli, 2019). The corpus consists of 94 written
transcripts of conversations, lasting eight minutes each, collected from
an experimental study of truthful and deceptive communication. The goal
of the study was to examine interpersonal linguistic alignment between
dyads across two conversations where participants either agreed or
disagreed with each other (as a randomly assigned between-dyads condition)
and where one of the conversations involved the truth and the other
deception (as a within-subjects condition).
- **Jupyter Notebook 1: CHILDES**

- This tutorial walks users through an analysis of conversations from a
single English corpus from the CHILDES database (MacWhinney,
2000)---specifically, Kuczaj’s Abe corpus (Kuczaj, 1976). We analyze the
last 20 conversations in the corpus in order to explore how ALIGN can be
used to track multi-level linguistic alignment between a parent and child
over time, which may be of interest to developmental language researchers.
Specifically, we explore how alignment between a parent and a child
changes over a brief span of developmental trajectory.

- **Jupyter Notebook 2: Devil's Advocate**
- This tutorial walks users throught the analysis reported in (Duran,
Paxton, & Fusaroli, 2019). The corpus consists of 94 written
transcripts of conversations, lasting eight minutes each, collected from
an experimental study of truthful and deceptive communication. The goal
of the study was to examine interpersonal linguistic alignment between
dyads across two conversations where participants either agreed or
disagreed with each other (as a randomly assigned between-dyads condition)
and where one of the conversations involved the truth and the other
deception (as a within-subjects condition).

We are in the process of adding more tutorials and would welcome additional
tutorials by interested contributors.
@@ -85,31 +90,33 @@ tutorials by interested contributors.

If you find the package useful, please cite our manuscript:

>Duran, N., Paxton, A., & Fusaroli, R. (2019). ALIGN: Analyzing
> Linguistic Interactions with Generalizable techNiques. *Psychological Methods*. http://dynamicog.org/papers/
> Duran, N., Paxton, A., & Fusaroli, R. (2019). ALIGN: Analyzing
> Linguistic Interactions with Generalizable techNiques. _Psychological Methods_. http://dynamicog.org/papers/
## Licensing of example data

* **CHILDES**
* Example corpus "Kuczaj Corpus" by Stan Kuczaj is licensed under a
Creative Commons Attribution-ShareAlike 3.0 Unported License
(https://childes.talkbank.org/access/Eng-NA/Kuczaj.html):

> Kuczaj, S. (1977). The acquisition of regular and irregular past tense
> forms. *Journal of Verbal Learning and Verbal Behavior, 16*, 589–600.
* **Devil's Advocate**
* The complete de-identified dataset of raw conversational transcripts
is hosted on a secure protected-access repository provided by the
Inter-university Consortium for Political and Social Research
(ICPSR). Please click on the link to access: http://dx.doi.org/10.3886/ICPSR37124.v1.
Due to the requirements of our IRB, please note that users interested in
obtaining these data must complete a Restricted Data Use Agreement, specify
the reason for the request, and obtain IRB approval or notice of exemption for their research.

> Duran, Nicholas, Alexandra Paxton, and Riccardo
> Fusaroli. Conversational Transcripts of Truthful and
> Deceptive Speech Involving Controversial Topics,
> Central California, 2012. ICPSR37124-v1. Ann Arbor,
> MI: Inter-university Consortium for Political and
> Social Research [distributor], 2018-08-29.
- **CHILDES**

- Example corpus "Kuczaj Corpus" by Stan Kuczaj is licensed under a
Creative Commons Attribution-ShareAlike 3.0 Unported License
(https://childes.talkbank.org/access/Eng-NA/Kuczaj.html):

> Kuczaj, S. (1977). The acquisition of regular and irregular past tense
> forms. _Journal of Verbal Learning and Verbal Behavior, 16_, 589–600.
- **Devil's Advocate**

- The complete de-identified dataset of raw conversational transcripts
is hosted on a secure protected-access repository provided by the
Inter-university Consortium for Political and Social Research
(ICPSR). Please click on the link to access: http://dx.doi.org/10.3886/ICPSR37124.v1.
Due to the requirements of our IRB, please note that users interested in
obtaining these data must complete a Restricted Data Use Agreement, specify
the reason for the request, and obtain IRB approval or notice of exemption for their research.

> Duran, Nicholas, Alexandra Paxton, and Riccardo
> Fusaroli. Conversational Transcripts of Truthful and
> Deceptive Speech Involving Controversial Topics,
> Central California, 2012. ICPSR37124-v1. Ann Arbor,
> MI: Inter-university Consortium for Political and
> Social Research [distributor], 2018-08-29.
1 change: 0 additions & 1 deletion binder/apt.txt

This file was deleted.

23 changes: 0 additions & 23 deletions binder/postBuild

This file was deleted.

1 change: 0 additions & 1 deletion binder/runtime.txt

This file was deleted.

10 changes: 5 additions & 5 deletions doc/Makefile
@@ -1,10 +1,10 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = python -msphinx
SPHINXPROJ = ALIGN
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

@@ -17,4 +17,4 @@ help:
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
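
The catch-all rule forwards any target to Sphinx's "make mode" (`-M`). The sketch below shows the command line this produces for a given target, using the variable defaults from the new Makefile; the helper name `sphinx_make_command` is hypothetical.

```python
import shlex


def sphinx_make_command(target, sphinxbuild="sphinx-build", sourcedir=".",
                        builddir="_build", sphinxopts="", extra=""):
    """Build the argument list the catch-all rule runs for `make <target>`.

    `sphinxopts` and `extra` stand in for $(SPHINXOPTS) and $(O).
    """
    command = shlex.split(sphinxbuild) + ["-M", target, sourcedir, builddir]
    command += shlex.split(sphinxopts) + shlex.split(extra)
    return command
```

For example, `sphinx_make_command("html")` gives the arguments for `make html`. Passing `sphinxbuild="python -msphinx"` reproduces the previous default. The list can be handed to `subprocess.run(..., check=True)` from the `doc` directory, provided Sphinx is installed.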
Binary file added doc/_build/doctrees/calculate_alignment.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/environment.pickle
Binary file not shown.
Binary file added doc/_build/doctrees/index.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/modules.doctree
Binary file not shown.
Binary file added doc/_build/doctrees/prepare_transcripts.doctree
Binary file not shown.
4 changes: 4 additions & 0 deletions doc/_build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 92fb6c56a78588f6ece5cb0d20061f36
tags: 645f666f9bcd5a90fca523b33c5a78b7
7 changes: 7 additions & 0 deletions doc/_build/html/_sources/calculate_alignment.rst.txt
@@ -0,0 +1,7 @@
calculate\_alignment module
===========================

.. automodule:: calculate_alignment
   :members:
   :undoc-members:
   :show-inheritance:
22 changes: 22 additions & 0 deletions doc/_build/html/_sources/index.rst.txt
@@ -0,0 +1,22 @@
.. ALIGN documentation master file, created by
   sphinx-quickstart on Mon Jul 4 09:14:01 2022.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to ALIGN's documentation!
=================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   modules



Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
8 changes: 8 additions & 0 deletions doc/_build/html/_sources/modules.rst.txt
@@ -0,0 +1,8 @@
align
=====

.. toctree::
   :maxdepth: 4

   calculate_alignment
   prepare_transcripts
7 changes: 7 additions & 0 deletions doc/_build/html/_sources/prepare_transcripts.rst.txt
@@ -0,0 +1,7 @@
prepare\_transcripts module
===========================

.. automodule:: prepare_transcripts
   :members:
   :undoc-members:
   :show-inheritance:
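
`automodule` imports the named module when the docs are built, so `calculate_alignment` and `prepare_transcripts` must be importable at that point. The sketch below shows the usual `conf.py` approach of extending `sys.path`. It assumes the modules live in `src/align` next to the `doc` directory; that location is a guess, since the commit's `conf.py` is not shown here, and the helper name `add_package_to_path` is hypothetical.

```python
import os
import sys


def add_package_to_path(conf_dir, relative="../src/align"):
    """Put the package directory at the front of sys.path for autodoc.

    `relative` is resolved against the directory holding conf.py; the
    default location is an assumption about the repository layout.
    """
    path = os.path.abspath(os.path.join(conf_dir, relative))
    if path not in sys.path:
        sys.path.insert(0, path)
    return path
```

In `conf.py` this would typically be called as `add_package_to_path(os.path.dirname(__file__))`, before the `extensions` list that enables `sphinx.ext.autodoc`.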