VEP has been busy improving its visualization tools and processing pipeline! You can read about all the changes in the list below.
- Text Processing Pipeline 2.0 features better Unicode handling during character cleaning and a dictionary that standardizes spelling variation across TCP corpora. Read about the pipeline on the ‘Workflow’ page. Download the pipeline from GitHub.
- TextDNA is available for download! The download includes sample datasets and Python scripts for curating your own TextDNA datasets. Download it from GitHub.
- Ubiqu+Ity 1.2 is officially released! SlimTV (short for Slim TextViewer) replaces the Ubiqu+Ity HTML files for navigating tagged text.
- Updated corpora (processed with Text Processing Pipeline 2.0) are available for download.