audubon: Japanese Text Processing Tools

A collection of Japanese text processing tools for filling Japanese iteration marks, converting between Japanese character types, segmenting text by phrase, and normalizing text according to the rules used for the 'Sudachi' morphological analyzer and 'NEologd' (the Neologism dictionary for 'MeCab'). These features are specific to Japanese and are not implemented in 'ICU' (International Components for Unicode).
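To make the terms above concrete, here is a minimal, language-agnostic sketch in Python of three of the operations: character type conversion, iteration mark filling, and width normalization. It is not the package's API or its implementation. All function names are hypothetical, the iteration mark handling covers only the simplest single-character marks, and Unicode NFKC is used as a rough stand-in for the richer 'Sudachi' and 'NEologd' normalization rules.

```python
import unicodedata

# Hiragana (U+3041..U+3096) and katakana (U+30A1..U+30F6) are laid out in
# parallel in Unicode, exactly 0x60 code points apart.
KANA_OFFSET = 0x60


def hiragana_to_katakana(text: str) -> str:
    """Hypothetical helper: shift each hiragana character into the katakana block."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def katakana_to_hiragana(text: str) -> str:
    """Hypothetical helper: shift each katakana character into the hiragana block."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


# Simplified assumption: only the unvoiced single-character marks are handled
# (hiragana U+309D, katakana U+30FD, kanji U+3005). Voiced and multi-character
# marks need extra rules that this sketch leaves out.
SIMPLE_ITERATION_MARKS = {"\u309d", "\u30fd", "\u3005"}


def fill_simple_iteration_marks(text: str) -> str:
    """Hypothetical helper: replace each simple iteration mark with the preceding character."""
    out = []
    for ch in text:
        if ch in SIMPLE_ITERATION_MARKS and out:
            out.append(out[-1])
        else:
            out.append(ch)
    return "".join(out)


def normalize_width(text: str) -> str:
    """Hypothetical helper: fold full-width ASCII and half-width katakana via NFKC."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(hiragana_to_katakana("ひらがな"))
    print(katakana_to_hiragana("カタカナ"))
    print(fill_simple_iteration_marks("こゝろ"))
    print(normalize_width("ＡＢＣ１２３"))
```

The code-point offset keeps the kana conversion table-free, at the cost of ignoring characters outside the two parallel ranges (for example the prolonged sound mark U+30FC, which is passed through unchanged).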

Version: 0.5.2
Depends: R (≥ 2.10)
Imports: dplyr (≥ 1.1.0), magrittr, Matrix, memoise, purrr, readr, rlang (≥ 0.4.11), stringi, V8
Suggests: roxygen2, spelling, testthat (≥ 3.0.0)
Published: 2024-04-27
DOI: 10.32614/CRAN.package.audubon
Author: Akiru Kato [cre, aut], Koki Takahashi [cph] (Author of japanese.js), Shuhei Iitsuka [cph] (Author of budoux), Taku Kudo [cph] (Author of TinySegmenter)
Maintainer: Akiru Kato <paithiov909 at>
License: Apache License (≥ 2)
NeedsCompilation: no
Language: en-US
Materials: README NEWS
CRAN checks: audubon results


Reference manual: audubon.pdf


Package source: audubon_0.5.2.tar.gz
Windows binaries: r-devel:, r-release:, r-oldrel:
macOS binaries: r-release (arm64): audubon_0.5.2.tgz, r-oldrel (arm64): audubon_0.5.2.tgz, r-release (x86_64): audubon_0.5.2.tgz, r-oldrel (x86_64): audubon_0.5.2.tgz
Old sources: audubon archive

