A Python project for finding homogeneous sub-columns within a free-text column.
To install colsplit, do the following:
git clone https://github.com/patrickwestphal/colsplit.git
pip install -e colsplit/
You may need to install numpy on your system first.
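If you are unsure whether numpy is already present, a quick check like the following may help. This is only a minimal sketch using the Python standard library; the helper name module_available is made up for illustration and is not part of colsplit.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found.

    Hypothetical helper for illustration only; not part of colsplit.
    """
    # find_spec returns None when no top-level module with that name exists
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("numpy"):
        print("numpy is already installed")
    else:
        print("numpy is missing; try: pip install numpy")
```

Run it with the same Python interpreter that you use for pip, so the check looks at the right environment.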
To use the standalone colsplit tool, run colsplit -h
and have a look at the command-line options.