Package: NLP 0.3-0

NLP: Natural Language Processing Infrastructure

Basic classes and methods for Natural Language Processing.

Authors: Kurt Hornik [aut, cre]

NLP_0.3-0.tar.gz
NLP_0.3-0.zip (r-4.5), NLP_0.3-0.zip (r-4.4), NLP_0.3-0.zip (r-4.3)
NLP_0.3-0.tgz (r-4.4-any), NLP_0.3-0.tgz (r-4.3-any)
NLP_0.3-0.tar.gz (r-4.5-noble), NLP_0.3-0.tar.gz (r-4.4-noble)
NLP_0.3-0.tgz (r-4.4-emscripten), NLP_0.3-0.tgz (r-4.3-emscripten)
NLP.pdf | NLP.html
NLP/json (API)

# Install 'NLP' in R:
install.packages('NLP', repos = c('https://kurthornik.r-universe.dev', 'https://cloud.r-project.org'))

This package does not link to any GitHub, GitLab, or R-Forge repository. No issue tracker or development information is available.

67 exports, 6 stars, 8.81 score, 0 dependencies, 124 dependents, 43 mentions, 944 scripts, 30.5k downloads

Last updated 2 months ago from: f60999ca21. Checks: OK: 3, NOTE: 4. Indexed: yes.

Target            Result   Date
Doc / Vignettes   OK       Sep 20 2024
R-4.5-win         OK       Sep 20 2024
R-4.5-linux       OK       Sep 20 2024
R-4.4-win         NOTE     Sep 20 2024
R-4.4-mac         NOTE     Sep 20 2024
R-4.3-win         NOTE     Sep 20 2024
R-4.3-mac         NOTE     Sep 20 2024

Exports: annotate, AnnotatedPlainTextDocument, annotation, Annotation, annotations_in_spans, Annotator, Annotator_Pipeline, as.Annotation, as.Annotator_Pipeline, as.Span, as.Span_Tokenizer, as.String, as.Tagged_Token, as.Token_Tokenizer, blankline_tokenizer, Brown_POS_tags, chunked_sents, CoNLLTextDocument, CoNLLUTextDocument, content, content<-, features, is.Annotation, is.Span, is.Span_Tokenizer, is.String, is.Tagged_Token, is.Token_Tokenizer, meta, meta<-, next_id, ngrams, otoks, paras, parse_IETF_language_tag, parse_ISO_8601_datetime, parsed_paras, parsed_sents, Penn_Treebank_POS_tags, Regexp_Tokenizer, sents, Simple_Chunk_Annotator, Simple_Entity_Annotator, Simple_Para_Token_Annotator, Simple_POS_Tag_Annotator, Simple_Sent_Token_Annotator, Simple_Stem_Annotator, Simple_Word_Token_Annotator, single_feature, Span, Span_Tokenizer, String, tagged_paras, tagged_sents, Tagged_Token, tagged_words, TaggedTextDocument, Token_Tokenizer, Tree, Tree_apply, Tree_parse, Universal_POS_tags, Universal_POS_tags_map, whitespace_tokenizer, WordListDocument, wordpunct_tokenizer, words

Dependencies: none

Readme and manuals

Help Manual

Help page: Topics
Annotate text strings: annotate
Annotated Plain Text Documents: AnnotatedPlainTextDocument annotation
Annotation objects: $<-.Annotation Annotation as.Annotation as.Annotation.Span as.data.frame.Annotation as.list.Annotation c.Annotation duplicated.Annotation format.Annotation is.Annotation length.Annotation merge.Annotation meta.Annotation meta<-.Annotation names.Annotation print.Annotation subset.Annotation unique.Annotation [.Annotation [[.Annotation
Annotations contained in character spans: annotations_in_spans
Annotator (pipeline) objects: Annotator Annotator_Pipeline as.Annotator_Pipeline
Simple annotator generators: Simple annotator generators Simple_Chunk_Annotator Simple_Entity_Annotator Simple_Para_Token_Annotator Simple_POS_Tag_Annotator Simple_Sent_Token_Annotator Simple_Stem_Annotator Simple_Word_Token_Annotator
CoNLL-Style Text Documents: CoNLLTextDocument
CoNLL-U Text Documents: CoNLLUTextDocument
Parse ISO 8601 Date/Time Strings: parse_ISO_8601_datetime
Extract Annotation Features: features
Access or Modify Content or Metadata: content content<- meta meta<-
Parse IETF Language Tag: parse_IETF_language_tag
Compute N-Grams: ngrams
Span objects: $<-.Span as.data.frame.Span as.list.Span as.Span c.Span duplicated.Span format.Span is.Span length.Span names.Span Ops.Span print.Span Span unique.Span [.Span [[.Span
String objects: as.String is.String String
Tagged_Token objects: $<-.Tagged_Token as.data.frame.Tagged_Token as.list.Tagged_Token as.Tagged_Token c.Tagged_Token duplicated.Tagged_Token format.Tagged_Token is.Tagged_Token length.Tagged_Token names.Tagged_Token print.Tagged_Token Tagged_Token unique.Tagged_Token [.Tagged_Token [[.Tagged_Token
POS-Tagged Word Text Documents: TaggedTextDocument
NLP Tag Sets: Brown_POS_tags Penn_Treebank_POS_tags Universal_POS_tags Universal_POS_tags_map
Text Documents: TextDocument
Tokenizer objects: as.Span_Tokenizer as.Token_Tokenizer is.Span_Tokenizer is.Token_Tokenizer Span_Tokenizer Token_Tokenizer
Regexp tokenizers: blankline_tokenizer Regexp_Tokenizer whitespace_tokenizer wordpunct_tokenizer
Tree objects: format.Tree print.Tree Tree Tree_apply Tree_parse
Annotation Utilities: next_id single_feature
Text Document Viewers: chunked_sents otoks paras parsed_paras parsed_sents sents tagged_paras tagged_sents tagged_words words
Word List Text Documents: WordListDocument
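
As a rough illustration of how a few of the topics listed above fit together, the following minimal sketch tokenizes a string on whitespace and computes bigrams. The sample sentence and the variable names (s, spans, tokens, bigrams) are made up for illustration; only as.String, whitespace_tokenizer, and ngrams come from the package, and the sketch assumes NLP is already installed.

```r
library(NLP)

# Hypothetical sample text, not taken from the package documentation.
s <- as.String("The quick brown fox jumps over the lazy dog.")

# whitespace_tokenizer() returns the character spans of the tokens.
spans <- whitespace_tokenizer(s)

# Subscripting a String object by spans extracts the corresponding substrings.
tokens <- s[spans]

# ngrams() returns a list with one element per n-gram (here, bigrams).
bigrams <- ngrams(tokens, 2L)

print(spans)
print(tokens)
print(bigrams[[1]])
```

Working with spans rather than bare tokens keeps the character offsets available, which is what the annotation classes listed above appear to build on.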