example.assets.example_components package
This module is auto generated by azure-ml-component.
- Assets included:
azureml://feeds/azureml
azureml://feeds/huggingface
- class example.assets.example_components.Datasets
Bases: object
- property acronym_identification_default
Huggingface acronym_identification-default dataset
- property ade_corpus_v2_ade_corpus_v2_classification
Huggingface ade_corpus_v2-Ade_corpus_v2_classification dataset
- property ade_corpus_v2_ade_corpus_v2_drug_ade_relation
Huggingface ade_corpus_v2-Ade_corpus_v2_drug_ade_relation dataset
- property ade_corpus_v2_ade_corpus_v2_drug_dosage_relation
Huggingface ade_corpus_v2-Ade_corpus_v2_drug_dosage_relation dataset
- property adult_census_income_binary_classification_dataset
Census Income dataset
- property adversarial_qa_adversarialqa
Huggingface adversarial_qa-adversarialQA dataset
- property adversarial_qa_dbert
Huggingface adversarial_qa-dbert dataset
- property adversarial_qa_dbidaf
Huggingface adversarial_qa-dbidaf dataset
- property adversarial_qa_droberta
Huggingface adversarial_qa-droberta dataset
- property aeslc_default
Huggingface aeslc-default dataset
- property afrikaans_ner_corpus_afrikaans_ner_corpus
Huggingface afrikaans_ner_corpus-afrikaans_ner_corpus dataset
- property ag_news_default
Huggingface ag_news-default dataset
- property ai2_arc_arc_challenge
Huggingface ai2_arc-ARC-Challenge dataset
- property ai2_arc_arc_easy
Huggingface ai2_arc-ARC-Easy dataset
- property air_dialogue_air_dialogue_data
Huggingface air_dialogue-air_dialogue_data dataset
- property air_dialogue_air_dialogue_kb
Huggingface air_dialogue-air_dialogue_kb dataset
- property akhooli_gpt2_small_arabic
Huggingface akhooli/gpt2-small-arabic model
- property akhooli_gpt2_small_arabic_poetry
Huggingface akhooli/gpt2-small-arabic-poetry model
- property allegro_reviews_default
Huggingface allegro_reviews-default dataset
- property allocine_allocine
Huggingface allocine-allocine dataset
- property alt_alt_km
Huggingface alt-alt-km dataset
- property alt_alt_my
Huggingface alt-alt-my dataset
- property alt_alt_my_transliteration
Huggingface alt-alt-my-transliteration dataset
- property alt_alt_my_west_transliteration
Huggingface alt-alt-my-west-transliteration dataset
- property alt_alt_parallel
Huggingface alt-alt-parallel dataset
- property alvaroalon2_biobert_chemical_ner
Huggingface alvaroalon2/biobert_chemical_ner model
- property alvaroalon2_biobert_diseases_ner
Huggingface alvaroalon2/biobert_diseases_ner model
- property amazon_polarity_amazon_polarity
Huggingface amazon_polarity-amazon_polarity dataset
- property amazon_reviews_multi_all_languages
Huggingface amazon_reviews_multi-all_languages dataset
- property amazon_reviews_multi_de
Huggingface amazon_reviews_multi-de dataset
- property amazon_reviews_multi_en
Huggingface amazon_reviews_multi-en dataset
- property amazon_reviews_multi_es
Huggingface amazon_reviews_multi-es dataset
- property amazon_reviews_multi_fr
Huggingface amazon_reviews_multi-fr dataset
- property amazon_reviews_multi_ja
Huggingface amazon_reviews_multi-ja dataset
- property amazon_reviews_multi_zh
Huggingface amazon_reviews_multi-zh dataset
- property ambig_qa_full
Huggingface ambig_qa-full dataset
- property ambig_qa_light
Huggingface ambig_qa-light dataset
- property amttl_amttl
Huggingface amttl-amttl dataset
- property animal_images_dataset
This sample dataset is derived from Open Image Dataset and includes 3 animal categories (cat, dog, frog). Each category contains 10 images.
- property anli_plain_text
Huggingface anli-plain_text dataset
- property app_reviews_default
Huggingface app_reviews-default dataset
- property aqua_rat_raw
Huggingface aqua_rat-raw dataset
- property aqua_rat_tokenized
Huggingface aqua_rat-tokenized dataset
- property ar_res_reviews_default
Huggingface ar_res_reviews-default dataset
- property arcd_plain_text
Huggingface arcd-plain_text dataset
- property arsentd_lev_default
Huggingface arsentd_lev-default dataset
- property art_anli
Huggingface art-anli dataset
- property asi_gpt_fr_cased_small
Huggingface asi/gpt-fr-cased-small model
- property aslg_pc12_default
Huggingface aslg_pc12-default dataset
- property asset_ratings
Huggingface asset-ratings dataset
- property asset_simplification
Huggingface asset-simplification dataset
- property assin2_default
Huggingface assin2-default dataset
- property assin_full
Huggingface assin-full dataset
- property assin_ptbr
Huggingface assin-ptbr dataset
- property assin_ptpt
Huggingface assin-ptpt dataset
- property atomic_atomic
Huggingface atomic-atomic dataset
- property automobile_price_data_raw
Clean missing data module required. Prices of various automobiles against make, model and technical specifications
- property autshumato_autshumato_en_tn
Huggingface autshumato-autshumato-en-tn dataset
- property autshumato_autshumato_en_ts
Huggingface autshumato-autshumato-en-ts dataset
- property autshumato_autshumato_en_ts_manual
Huggingface autshumato-autshumato-en-ts-manual dataset
- property autshumato_autshumato_en_zu
Huggingface autshumato-autshumato-en-zu dataset
- property avichr_hebert_sentiment_analysis
Huggingface avichr/heBERT_sentiment_analysis model
- property bert_base_uncased
Huggingface bert-base-uncased model
- property bertin_project_bertin_base_ner_conll2002_es
Huggingface bertin-project/bertin-base-ner-conll2002-es model
- property bsc_tecla_tecla
Huggingface bsc/tecla-tecla dataset
- property cahya_gpt2_small_indonesian_522m
Huggingface cahya/gpt2-small-indonesian-522M model
- property cahya_gpt2_small_indonesian_story
Huggingface cahya/gpt2-small-indonesian-story model
- property capreolus_bert_base_msmarco
Huggingface Capreolus/bert-base-msmarco model
- property cardiffnlp_twitter_roberta_base_emotion
Huggingface cardiffnlp/twitter-roberta-base-emotion model
- property chambliss_distilbert_for_food_extraction
Huggingface chambliss/distilbert-for-food-extraction model
- property ckiplab_albert_base_chinese_ner
Huggingface ckiplab/albert-base-chinese-ner model
- property ckiplab_albert_base_chinese_pos
Huggingface ckiplab/albert-base-chinese-pos model
- property ckiplab_albert_base_chinese_ws
Huggingface ckiplab/albert-base-chinese-ws model
- property ckiplab_albert_tiny_chinese_ws
Huggingface ckiplab/albert-tiny-chinese-ws model
- property ckiplab_bert_base_chinese_ner
Huggingface ckiplab/bert-base-chinese-ner model
- property ckiplab_bert_base_chinese_pos
Huggingface ckiplab/bert-base-chinese-pos model
- property ckiplab_bert_base_chinese_ws
Huggingface ckiplab/bert-base-chinese-ws model
- property colorfulscoop_gpt2_small_ja
Huggingface colorfulscoop/gpt2-small-ja model
CRM Appetency Labels
CRM Churn Labels
CRM Dataset
CRM Upselling Labels
- property cross_encoder_ms_marco_electra_base
Huggingface cross-encoder/ms-marco-electra-base model
- property cross_encoder_stsb_tinybert_l_4
Huggingface cross-encoder/stsb-TinyBERT-L-4 model
- property datificate_gpt2_small_spanish
Huggingface datificate/gpt2-small-spanish model
- property dbmdz_bert_base_cased_finetuned_conll03_english
Huggingface dbmdz/bert-base-cased-finetuned-conll03-english model
- property distilbert_base_uncased_finetuned_sst_2_english
Huggingface distilbert-base-uncased-finetuned-sst-2-english model
- property dslim_bert_base_ner
Huggingface dslim/bert-base-NER model
- property dslim_bert_base_ner_uncased
Huggingface dslim/bert-base-NER-uncased model
- property elastic_distilbert_base_cased_finetuned_conll03_english
Huggingface elastic/distilbert-base-cased-finetuned-conll03-english model
- property elastic_distilbert_base_uncased_finetuned_conll03_english
Huggingface elastic/distilbert-base-uncased-finetuned-conll03-english model
- property ethanyt_guwen_ner
Huggingface ethanyt/guwen-ner model
- property ethanyt_guwen_punc
Huggingface ethanyt/guwen-punc model
- property ferch423_gpt2_small_portuguese_wikipediabio
Huggingface Ferch423/gpt2-small-portuguese-wikipediabio model
- property finiteautomata_bertweet_base_sentiment_analysis
Huggingface finiteautomata/bertweet-base-sentiment-analysis model
- property finiteautomata_beto_sentiment_analysis
Huggingface finiteautomata/beto-sentiment-analysis model
- property flight_delays_data
Flight Delays Data
- property german_credit_card_uci_dataset
German Credit Card UCI dataset
- property gilf_french_camembert_postag_model
Huggingface gilf/french-camembert-postag-model model
- property glue_ax
Huggingface glue-ax dataset
- property glue_cola
Huggingface glue-cola dataset
- property glue_mnli
Huggingface glue-mnli dataset
- property glue_mnli_matched
Huggingface glue-mnli_matched dataset
- property glue_mnli_mismatched
Huggingface glue-mnli_mismatched dataset
- property glue_mrpc
Huggingface glue-mrpc dataset
- property glue_qnli
Huggingface glue-qnli dataset
- property glue_qqp
Huggingface glue-qqp dataset
- property glue_rte
Huggingface glue-rte dataset
- property glue_sst2
Huggingface glue-sst2 dataset
- property glue_stsb
Huggingface glue-stsb dataset
- property glue_wnli
Huggingface glue-wnli dataset
- property gronlp_gpt2_small_italian
Huggingface GroNLP/gpt2-small-italian model
- property gunghio_distilbert_base_multilingual_cased_finetuned_conll2003_ner
Huggingface gunghio/distilbert-base-multilingual-cased-finetuned-conll2003-ner model
- property hf_internal_testing_tiny_xlm_roberta
Huggingface hf-internal-testing/tiny-xlm-roberta model
- property imdb_movie_titles
IMDB Movie Titles
- property imdb_plain_text
Huggingface imdb-plain_text dataset
- property jsfoon_slogan_generator
Huggingface jsfoon/slogan-generator model
- property lilaboualili_bert_vanilla
Huggingface LilaBoualili/bert-vanilla model
- property lordtt13_emo_mobilebert
Huggingface lordtt13/emo-mobilebert model
- property maltehb_l_ctra_danish_electra_small_uncased_ner_dane
Huggingface Maltehb/-l-ctra-danish-electra-small-uncased-ner-dane model
- property media1129_recipe_tag_model
Huggingface Media1129/recipe-tag-model model
- property microsoft_codegpt_small_py
Huggingface microsoft/CodeGPT-small-py model
- property microsoft_codegpt_small_py_adaptedgpt2
Huggingface microsoft/CodeGPT-small-py-adaptedGPT2 model
- property microsoft_minilm_l12_h384_uncased
Huggingface microsoft/MiniLM-L12-H384-uncased model
- property movie_ratings
Movie Ratings
- property mrm8488_bert_spanish_cased_finetuned_ner
Huggingface mrm8488/bert-spanish-cased-finetuned-ner model
- property mrm8488_bert_tiny_finetuned_sms_spam_detection
Huggingface mrm8488/bert-tiny-finetuned-sms-spam-detection model
- property mrm8488_codebert_base_finetuned_stackoverflow_ner
Huggingface mrm8488/codebert-base-finetuned-stackoverflow-ner model
- property mrm8488_mobilebert_finetuned_ner
Huggingface mrm8488/mobilebert-finetuned-ner model
- property mrm8488_mobilebert_finetuned_pos
Huggingface mrm8488/mobilebert-finetuned-pos model
- property myx4567_distilgpt2_finetuned_wikitext2
Huggingface MYX4567/distilgpt2-finetuned-wikitext2 model
- property narsil_tiny_distilbert_sequence_classification
Huggingface Narsil/tiny-distilbert-sequence-classification model
- property nateraw_bert_base_uncased_emotion
Huggingface nateraw/bert-base-uncased-emotion model
- property oliverguhr_german_sentiment_bert
Huggingface oliverguhr/german-sentiment-bert model
- property philschmid_distilroberta_base_ner_conll2003
Huggingface philschmid/distilroberta-base-ner-conll2003 model
- property pierreguillou_gpt2_small_portuguese
Huggingface pierreguillou/gpt2-small-portuguese model
- property pierrerappolt_disease_extraction
Huggingface pierrerappolt/disease-extraction model
Huggingface pranavpsv/gpt2-genre-story-generator model
- property prosusai_finbert
Huggingface ProsusAI/finbert model
- property proycon_bert_ner_cased_sonar1_nld
Huggingface proycon/bert-ner-cased-sonar1-nld model
- property recordedfuture_swedish_ner
Huggingface RecordedFuture/Swedish-NER model
- property restaurant_customer_data
Contains customer features, such as drink_level, dress_preference and marital_status.
- property restaurant_feature_data
Contains restaurant features, such as name, address and dress_code.
- property restaurant_ratings
Contains ratings given by customers to restaurants on scale from 0 to 2.
- property sgugger_tiny_distilbert_classification
Huggingface sgugger/tiny-distilbert-classification model
- property squad_adversarial_addonesent
Huggingface squad_adversarial-AddOneSent dataset
- property squad_adversarial_addsent
Huggingface squad_adversarial-AddSent dataset
- property squad_es_v1_1_0
Huggingface squad_es-v1.1.0 dataset
- property squad_it_default
Huggingface squad_it-default dataset
- property squad_plain_text
Huggingface squad-plain_text dataset
- property squad_v1_pt_default
Huggingface squad_v1_pt-default dataset
- property squad_v2_squad_v2
Huggingface squad_v2-squad_v2 dataset
- property squadshifts_amazon
Huggingface squadshifts-amazon dataset
- property squadshifts_new_wiki
Huggingface squadshifts-new_wiki dataset
- property squadshifts_nyt
Huggingface squadshifts-nyt dataset
- property squadshifts_reddit
Huggingface squadshifts-reddit dataset
- property sshleifer_tiny_ctrl
Huggingface sshleifer/tiny-ctrl model
- property sshleifer_tiny_dbmdz_bert_large_cased_finetuned_conll03_english
Huggingface sshleifer/tiny-dbmdz-bert-large-cased-finetuned-conll03-english model
- property sshleifer_tiny_distilbert_base_cased
Huggingface sshleifer/tiny-distilbert-base-cased model
- property sshleifer_tiny_distilbert_base_uncased_finetuned_sst_2_english
Huggingface sshleifer/tiny-distilbert-base-uncased-finetuned-sst-2-english model
- property sshleifer_tiny_gpt2
Huggingface sshleifer/tiny-gpt2 model
- property sshleifer_tiny_xlnet_base_cased
Huggingface sshleifer/tiny-xlnet-base-cased model
- property super_glue_axb
Huggingface super_glue-axb dataset
- property super_glue_axg
Huggingface super_glue-axg dataset
- property super_glue_boolq
Huggingface super_glue-boolq dataset
- property super_glue_cb
Huggingface super_glue-cb dataset
- property super_glue_copa
Huggingface super_glue-copa dataset
- property super_glue_multirc
Huggingface super_glue-multirc dataset
- property super_glue_record
Huggingface super_glue-record dataset
- property super_glue_rte
Huggingface super_glue-rte dataset
- property super_glue_wic
Huggingface super_glue-wic dataset
- property super_glue_wsc
Huggingface super_glue-wsc dataset
- property super_glue_wsc_fixed
Huggingface super_glue-wsc.fixed dataset
- property swag_full
Huggingface swag-full dataset
- property swag_regular
Huggingface swag-regular dataset
- property swahili_news_swahili_news
Huggingface swahili_news-swahili_news dataset
- property swahili_swahili
Huggingface swahili-swahili dataset
- property swda_default
Huggingface swda-default dataset
- property swedish_ner_corpus_default
Huggingface swedish_ner_corpus-default dataset
- property swedish_reviews_plain_text
Huggingface swedish_reviews-plain_text dataset
- property tab_fact_blind_test
Huggingface tab_fact-blind_test dataset
- property tab_fact_tab_fact
Huggingface tab_fact-tab_fact dataset
- property tamilmixsentiment_default
Huggingface tamilmixsentiment-default dataset
- property tanzil_bg_en
Huggingface tanzil-bg-en dataset
- property tanzil_bn_hi
Huggingface tanzil-bn-hi dataset
- property tanzil_en_tr
Huggingface tanzil-en-tr dataset
- property tanzil_fa_sv
Huggingface tanzil-fa-sv dataset
- property tanzil_ru_zh
Huggingface tanzil-ru-zh dataset
- property tapaco_en
Huggingface tapaco-en dataset
- property tapaco_eo
Huggingface tapaco-eo dataset
- property tapaco_es
Huggingface tapaco-es dataset
- property tapaco_et
Huggingface tapaco-et dataset
- property tapaco_eu
Huggingface tapaco-eu dataset
- property tapaco_fi
Huggingface tapaco-fi dataset
- property tapaco_fr
Huggingface tapaco-fr dataset
- property tapaco_gl
Huggingface tapaco-gl dataset
- property tapaco_gos
Huggingface tapaco-gos dataset
- property textattack_bert_base_uncased_cola
Huggingface textattack/bert-base-uncased-CoLA model
- property textattack_bert_base_uncased_imdb
Huggingface textattack/bert-base-uncased-imdb model
- property textattack_bert_base_uncased_mnli
Huggingface textattack/bert-base-uncased-MNLI model
- property textattack_bert_base_uncased_snli
Huggingface textattack/bert-base-uncased-snli model
- property textattack_bert_base_uncased_sst_2
Huggingface textattack/bert-base-uncased-SST-2 model
- property textattack_distilbert_base_uncased_imdb
Huggingface textattack/distilbert-base-uncased-imdb model
- property textattack_distilbert_base_uncased_rotten_tomatoes
Huggingface textattack/distilbert-base-uncased-rotten-tomatoes model
- property textattack_roberta_base_imdb
Huggingface textattack/roberta-base-imdb model
- property textattack_xlnet_base_cased_imdb
Huggingface textattack/xlnet-base-cased-imdb model
- property transformersbook_codepage_small
Huggingface transformersbook/codepage-small model
- property uer_gpt2_chinese_poem
Huggingface uer/gpt2-chinese-poem model
- property uer_roberta_base_finetuned_cluener2020_chinese
Huggingface uer/roberta-base-finetuned-cluener2020-chinese model
- property unitary_toxic_bert
Huggingface unitary/toxic-bert model
- property vblagoje_bert_english_uncased_finetuned_pos
Huggingface vblagoje/bert-english-uncased-finetuned-pos model
- property vishnun_distilgpt2_finetuned_distilgpt2_med_articles
Huggingface vishnun/distilgpt2-finetuned-distilgpt2-med_articles model
- property vishnun_distilgpt2_finetuned_tamilmixsentiment
Huggingface vishnun/distilgpt2-finetuned-tamilmixsentiment model
- property weather_dataset
Weather Dataset
- property wietsedv_bert_base_multilingual_cased_finetuned_conll2002_ner
Huggingface wietsedv/bert-base-multilingual-cased-finetuned-conll2002-ner model
- property wikipedia_sp_500_dataset
Wikipedia SP 500 Dataset
- property xlnet_base_cased
Huggingface xlnet-base-cased model
- example.assets.example_components.azureml_add_columns(left_dataset: Optional[pathlib.Path] = None, right_dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlAddColumnsComponent
Adds a set of columns from one dataset to another.
- Parameters
left_dataset (Path) – Left dataset
right_dataset (Path) – Right dataset
- Output combined_dataset
Combined dataset
- Type
combined_dataset: Output
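As a rough illustration of what "adding columns" means, the sketch below places two row-aligned tables side by side using plain Python dictionaries. It does not call the generated component: the helper `add_columns` and the sample rows are hypothetical, and the sketch assumes both inputs have the same number of rows.

```python
from typing import Dict, List

Row = Dict[str, object]


def add_columns(left: List[Row], right: List[Row]) -> List[Row]:
    """Hypothetical helper: put the columns of `right` next to those of `left`."""
    if len(left) != len(right):
        raise ValueError("this sketch assumes both datasets have the same row count")
    # Merge each pair of rows; on a column-name clash the right-hand value wins
    # (a choice made for this sketch only).
    return [{**l, **r} for l, r in zip(left, right)]


left_dataset = [{"id": 1}, {"id": 2}]
right_dataset = [{"score": 0.5}, {"score": 0.9}]
combined_dataset = add_columns(left_dataset, right_dataset)
```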
- example.assets.example_components.azureml_add_rows(dataset1: Optional[pathlib.Path] = None, dataset2: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlAddRowsComponent
Appends a set of rows from an input dataset to the end of another dataset.
- Parameters
dataset1 (Path) – Dataset rows to be added to the output dataset first
dataset2 (Path) – Dataset rows to be appended to the first dataset
- Output results_dataset
Dataset that contains all rows of both input datasets
- Type
results_dataset: Output
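The row-appending behaviour described above can be pictured with a minimal sketch in plain Python. The helper `add_rows` is hypothetical and stands in for the component only conceptually; it assumes both inputs share the same columns.

```python
from typing import Dict, List

Row = Dict[str, object]


def add_rows(dataset1: List[Row], dataset2: List[Row]) -> List[Row]:
    """Hypothetical helper: rows of dataset1 first, then rows of dataset2."""
    return list(dataset1) + list(dataset2)


results_dataset = add_rows([{"x": 1}], [{"x": 2}, {"x": 3}])
```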
- example.assets.example_components.azureml_apply_image_transformation(input_image_transformation: Optional[pathlib.Path] = None, input_image_directory: Optional[pathlib.Path] = None, mode: Optional[example.assets.example_components._assets._AzuremlApplyImageTransformationModeEnum] = None) example.assets.example_components._assets._AzuremlApplyImageTransformationComponent
Applies an image transformation to an image directory.
- Parameters
input_image_transformation (Path) – Input image transformation
input_image_directory (Path) – Input image directory
mode (_AzuremlApplyImageTransformationModeEnum) – Whether ‘Random’ transform operations are applied: they should be excluded in inference but kept in training (enum: [‘For training’, ‘For inference’])
- Output output_image_directory
Output image directory
- Type
output_image_directory: Output
- example.assets.example_components.azureml_apply_math_operation(input: Optional[pathlib.Path] = None, category: example.assets.example_components._assets._AzuremlApplyMathOperationCategoryEnum = _AzuremlApplyMathOperationCategoryEnum.basic, basic_func: example.assets.example_components._assets._AzuremlApplyMathOperationBasicFuncEnum = _AzuremlApplyMathOperationBasicFuncEnum.abs, basic_arg_type: example.assets.example_components._assets._AzuremlApplyMathOperationBasicArgTypeEnum = _AzuremlApplyMathOperationBasicArgTypeEnum.constant, basic_constant: float = 1, basic_column_selector: Optional[str] = None, compare_func: example.assets.example_components._assets._AzuremlApplyMathOperationCompareFuncEnum = _AzuremlApplyMathOperationCompareFuncEnum.equalto, compare_arg_type: example.assets.example_components._assets._AzuremlApplyMathOperationCompareArgTypeEnum = _AzuremlApplyMathOperationCompareArgTypeEnum.constant, compare_constant: float = 1, compare_column_selector: Optional[str] = None, operations_func: example.assets.example_components._assets._AzuremlApplyMathOperationOperationsFuncEnum = _AzuremlApplyMathOperationOperationsFuncEnum.add, operations_arg_type: example.assets.example_components._assets._AzuremlApplyMathOperationOperationsArgTypeEnum = _AzuremlApplyMathOperationOperationsArgTypeEnum.constant, operations_constant: float = 1, operations_column_selector: Optional[str] = None, rounding_func: example.assets.example_components._assets._AzuremlApplyMathOperationRoundingFuncEnum = _AzuremlApplyMathOperationRoundingFuncEnum.ceiling, rounding_arg_type: example.assets.example_components._assets._AzuremlApplyMathOperationRoundingArgTypeEnum = _AzuremlApplyMathOperationRoundingArgTypeEnum.constant, rounding_constant: float = 1, rounding_column_selector: Optional[str] = None, special_func: example.assets.example_components._assets._AzuremlApplyMathOperationSpecialFuncEnum = _AzuremlApplyMathOperationSpecialFuncEnum.beta, special_arg_type: example.assets.example_components._assets._AzuremlApplyMathOperationSpecialArgTypeEnum = _AzuremlApplyMathOperationSpecialArgTypeEnum.constant, special_constant: float = 1, special_column_selector: Optional[str] = None, trigonometric_func: example.assets.example_components._assets._AzuremlApplyMathOperationTrigonometricFuncEnum = _AzuremlApplyMathOperationTrigonometricFuncEnum.acos, column_selector: Optional[str] = None, output_mode: example.assets.example_components._assets._AzuremlApplyMathOperationOutputModeEnum = _AzuremlApplyMathOperationOutputModeEnum.append) example.assets.example_components._assets._AzuremlApplyMathOperationComponent
Applies a mathematical operation to column values.
- Parameters
input (Path) – DataFrameDirectory
category (_AzuremlApplyMathOperationCategoryEnum) – enum (enum: [‘Basic’, ‘Compare’, ‘Operations’, ‘Rounding’, ‘Special’, ‘Trigonometric’])
basic_func (_AzuremlApplyMathOperationBasicFuncEnum) – enum (optional, enum: [‘Abs’, ‘Atan2’, ‘Conj’, ‘Cuberoot’, ‘DoubleFactorial’, ‘Eps’, ‘Exp’, ‘Exp2’, ‘ExpMinus1’, ‘Factorial’, ‘Hypotenuse’, ‘ImaginaryPart’, ‘Ln’, ‘LnPlus1’, ‘Log’, ‘Log10’, ‘Log2’, ‘NthRoot’, ‘Pow’, ‘RealPart’, ‘Sqrt’, ‘SqrtPi’, ‘Square’])
basic_arg_type (_AzuremlApplyMathOperationBasicArgTypeEnum) – enum (optional, enum: [‘Constant’, ‘ColumnSet’])
basic_constant (float) – float (optional)
basic_column_selector (str) – ColumnPicker (optional)
compare_func (_AzuremlApplyMathOperationCompareFuncEnum) – enum (optional, enum: [‘EqualTo’, ‘GreaterThan’, ‘GreaterThanOrEqualTo’, ‘LessThan’, ‘LessThanOrEqualTo’, ‘NotEqualTo’, ‘PairMax’, ‘PairMin’])
compare_arg_type (_AzuremlApplyMathOperationCompareArgTypeEnum) – enum (optional, enum: [‘Constant’, ‘ColumnSet’])
compare_constant (float) – float (optional)
compare_column_selector (str) – ColumnPicker (optional)
operations_func (_AzuremlApplyMathOperationOperationsFuncEnum) – enum (optional, enum: [‘Add’, ‘Divide’, ‘Multiply’, ‘Subtract’])
operations_arg_type (_AzuremlApplyMathOperationOperationsArgTypeEnum) – enum (optional, enum: [‘Constant’, ‘ColumnSet’])
operations_constant (float) – float (optional)
operations_column_selector (str) – ColumnPicker (optional)
rounding_func (_AzuremlApplyMathOperationRoundingFuncEnum) – enum (optional, enum: [‘Ceiling’, ‘CeilingPower2’, ‘Floor’, ‘Mod’, ‘Quotient’, ‘Remainder’, ‘RoundDigits’, ‘RoundDown’, ‘RoundUp’, ‘ToEven’, ‘ToMultiple’, ‘ToOdd’, ‘Truncate’])
rounding_arg_type (_AzuremlApplyMathOperationRoundingArgTypeEnum) – enum (optional, enum: [‘Constant’, ‘ColumnSet’])
rounding_constant (float) – float (optional)
rounding_column_selector (str) – ColumnPicker (optional)
special_func (_AzuremlApplyMathOperationSpecialFuncEnum) – enum (optional, enum: [‘Beta’, ‘BetaLn’, ‘EllipticIntegralE’, ‘EllipticIntegralK’, ‘Erf’, ‘Erfc’, ‘ErfcScaled’, ‘ErfInverse’, ‘ExponentialIntegralEin’, ‘Gamma’, ‘GammaLn’, ‘GammaRegularizedP’, ‘GammaRegularizedPInverse’, ‘GammaRegularizedQ’, ‘GammaRegularizedQInverse’, ‘Polygamma’])
special_arg_type (_AzuremlApplyMathOperationSpecialArgTypeEnum) – enum (optional, enum: [‘Constant’, ‘ColumnSet’])
special_constant (float) – float (optional)
special_column_selector (str) – ColumnPicker (optional)
trigonometric_func (_AzuremlApplyMathOperationTrigonometricFuncEnum) – enum (optional, enum: [‘Acos’, ‘AcosDegrees’, ‘Acosh’, ‘Acot’, ‘AcotDegrees’, ‘Acoth’, ‘Acsc’, ‘AcscDegrees’, ‘Acsch’, ‘Arg’, ‘Asec’, ‘AsecDegrees’, ‘Asech’, ‘Asin’, ‘AsinDegrees’, ‘Asinh’, ‘Atan’, ‘AtanDegrees’, ‘Atanh’, ‘Cis’, ‘Cos’, ‘CosDegrees’, ‘Cosh’, ‘Cot’, ‘CotDegrees’, ‘Coth’, ‘Csc’, ‘CscDegrees’, ‘Csch’, ‘DegreesToRadians’, ‘RadiansToDegrees’, ‘Sec’, ‘SecDegrees’, ‘Sech’, ‘Sign’, ‘Sin’, ‘Sinc’, ‘SinDegrees’, ‘Sinh’, ‘Tan’, ‘TanDegrees’, ‘Tanh’])
column_selector (str) – ColumnPicker
output_mode (_AzuremlApplyMathOperationOutputModeEnum) – enum (enum: [‘Append’, ‘Inplace’, ‘ResultOnly’])
- Output result_dataset
DataFrameDirectory
- Type
result_dataset: Output
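The three `output_mode` values are only named in the parameter list, so the sketch below shows one plausible reading of them using the ‘Abs’ function on a single column. Everything here is an assumption made for illustration: the helper `apply_to_column`, the naming of the result column, and the exact semantics of each mode are not taken from the generated component.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def apply_to_column(
    rows: List[Row],
    column: str,
    func: Callable[[float], float],
    output_mode: str = "Append",
) -> List[Row]:
    """Hypothetical helper: apply `func` to one column under an assumed output mode."""
    result_name = f"{func.__name__}({column})"  # assumed naming scheme
    out: List[Row] = []
    for row in rows:
        value = func(row[column])
        if output_mode == "Append":        # keep all columns, add the result
            out.append({**row, result_name: value})
        elif output_mode == "Inplace":     # overwrite the source column
            out.append({**row, column: value})
        elif output_mode == "ResultOnly":  # keep only the computed column
            out.append({result_name: value})
        else:
            raise ValueError(f"unknown output_mode: {output_mode}")
    return out


rows = [{"a": -1.5, "b": 2.0}]
appended = apply_to_column(rows, "a", abs, "Append")
inplace = apply_to_column(rows, "a", abs, "Inplace")
result_only = apply_to_column(rows, "a", abs, "ResultOnly")
```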
- example.assets.example_components.azureml_apply_sql_transformation(t1: Optional[pathlib.Path] = None, t2: Optional[pathlib.Path] = None, t3: Optional[pathlib.Path] = None, sqlquery: str = 'select * from t1') example.assets.example_components._assets._AzuremlApplySqlTransformationComponent
Runs a SQLite query on input datasets to transform the data.
- Parameters
t1 (Path) – DataFrameDirectory
t2 (Path) – DataFrameDirectory (optional)
t3 (Path) – DataFrameDirectory (optional)
sqlquery (str) – Script
- Output result_dataset
DataFrameDirectory
- Type
result_dataset: Output
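Because the component runs a SQLite query in which the inputs are referred to as `t1`, `t2` and `t3`, a query can be tried out locally with Python's standard `sqlite3` module before it is passed as `sqlquery`. The sketch below is a local stand-in, not the component itself; the table schema and sample rows are invented for illustration.

```python
import sqlite3

# In-memory database with a table named t1, mirroring the name used in the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (name TEXT, price REAL)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [("a", 10.0), ("b", 25.0)])

# The default query from the signature above.
sqlquery = "select * from t1"
result_dataset = conn.execute(sqlquery).fetchall()

# A more selective query against the same table.
filtered = conn.execute("select name from t1 where price > 20").fetchall()
conn.close()
```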
- example.assets.example_components.azureml_apply_transformation(transformation: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlApplyTransformationComponent
Applies a well-specified data transformation to a dataset.
- Parameters
transformation (Path) – A unary data transformation
dataset (Path) – Dataset to be transformed
- Output transformed_dataset
Transformed dataset
- Type
transformed_dataset: Output
- example.assets.example_components.azureml_assign_data_to_clusters(trained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, check_for_append_or_uncheck_for_result_only: bool = True) example.assets.example_components._assets._AzuremlAssignDataToClustersComponent
Assigns data to clusters using an existing trained clustering model.
- Parameters
trained_model (Path) – Trained clustering model
dataset (Path) – Input data source
check_for_append_or_uncheck_for_result_only (bool) – Whether the output dataset contains the input dataset with the assignments column appended (Checked) or the assignments column only (Unchecked)
- Output results_dataset
Input dataset with the assignments column appended, or the assignments column only
- Type
results_dataset: Output
- example.assets.example_components.azureml_boosted_decision_tree_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlBoostedDecisionTreeRegressionCreateTrainerModeEnum = _AzuremlBoostedDecisionTreeRegressionCreateTrainerModeEnum.singleparameter, maximum_number_of_leaves_per_tree: int = 20, minimum_number_of_training_instances_required_to_form_a_leaf: int = 10, the_learning_rate: float = 0.2, total_number_of_trees_constructed: int = 100, range_for_maximum_number_of_leaves_per_tree: str = '2; 8; 32; 128', range_for_minimum_number_of_training_instances_required_to_form_a_leaf: str = '1; 10; 50', range_for_learning_rate: str = '0.025; 0.05; 0.1; 0.2; 0.4', range_for_total_number_of_trees_constructed: str = '20; 100; 500', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlBoostedDecisionTreeRegressionComponent
Creates a regression model using the Boosted Decision Tree algorithm.
- Parameters
create_trainer_mode (_AzuremlBoostedDecisionTreeRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
maximum_number_of_leaves_per_tree (int) – Specify the maximum number of leaves per tree (optional, min: 2, max: 131072)
minimum_number_of_training_instances_required_to_form_a_leaf (int) – Specify the minimum number of cases required to form a leaf node (optional, min: 1)
the_learning_rate (float) – Specify the initial learning rate (optional, min: 2.220446049250313e-16, max: 1.0)
total_number_of_trees_constructed (int) – Specify the maximum number of trees that can be created during training (optional, min: 1)
range_for_maximum_number_of_leaves_per_tree (str) – Specify range for the maximum number of leaves allowed per tree (optional)
range_for_minimum_number_of_training_instances_required_to_form_a_leaf (str) – Specify the range for the minimum number of cases required to form a leaf (optional)
range_for_learning_rate (str) – Specify the range for the initial learning rate (optional)
range_for_total_number_of_trees_constructed (str) – Specify the range for the maximum number of trees that can be created during training (optional)
random_number_seed (int) – Provide a seed for the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained regression model that can be connected to the Train Generic Model or Cross Validate Model modules
- Type
untrained_model: Output
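The `range_for_*` parameters take strings such as '2; 8; 32; 128'. Assuming these are semicolon-separated candidate values (the format is inferred from the defaults, not documented here), a small parser makes it easy to check a range before passing it in. The helper `parse_range` is hypothetical, and applying the single-value bounds to range candidates is likewise an assumption.

```python
from typing import Callable, List


def parse_range(text: str, cast: Callable[[str], float] = float) -> List[float]:
    """Hypothetical helper: split a ';'-separated range string into numbers."""
    return [cast(part.strip()) for part in text.split(";") if part.strip()]


leaves = parse_range("2; 8; 32; 128", int)
learning_rates = parse_range("0.025; 0.05; 0.1; 0.2; 0.4")

# Check the candidates against the bounds documented for the single-value
# parameters (assumed to apply to range candidates as well).
leaves_ok = all(2 <= n <= 131072 for n in leaves)
rates_ok = all(2.220446049250313e-16 <= r <= 1.0 for r in learning_rates)
```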
- example.assets.example_components.azureml_clean_missing_data(dataset: Optional[pathlib.Path] = None, columns_to_be_cleaned: Optional[str] = None, minimum_missing_value_ratio: float = 0.0, maximum_missing_value_ratio: float = 1.0, cleaning_mode: example.assets.example_components._assets._AzuremlCleanMissingDataCleaningModeEnum = _AzuremlCleanMissingDataCleaningModeEnum.custom_substitution_value, replacement_value: str = '0', generate_missing_value_indicator_column: bool = False, cols_with_all_missing_values: example.assets.example_components._assets._AzuremlCleanMissingDataColsWithAllMissingValuesEnum = _AzuremlCleanMissingDataColsWithAllMissingValuesEnum.remove) example.assets.example_components._assets._AzuremlCleanMissingDataComponent
Specifies how to handle the values missing from a dataset.
- Parameters
dataset (Path) – Dataset to be cleaned
columns_to_be_cleaned (str) – Columns for missing values clean operation
minimum_missing_value_ratio (float) – Clean only columns with missing value ratio above specified value, out of set of all selected columns (max: 1.0)
maximum_missing_value_ratio (float) – Clean only columns with missing value ratio below specified value, out of set of all selected columns (max: 1.0)
cleaning_mode (_AzuremlCleanMissingDataCleaningModeEnum) – Algorithm to clean missing values (enum: [‘Custom substitution value’, ‘Replace with mean’, ‘Replace with median’, ‘Replace with mode’, ‘Remove entire row’, ‘Remove entire column’])
replacement_value (str) – Type the value that takes the place of missing values (optional)
generate_missing_value_indicator_column (bool) – Generate a column that indicates which rows were cleaned (optional)
cols_with_all_missing_values (_AzuremlCleanMissingDataColsWithAllMissingValuesEnum) – Cols with all missing values (optional, enum: [‘Propagate’, ‘Remove’])
- Output cleaned_dataset
Cleaned dataset
- Type
cleaned_dataset: Output
- Output cleaning_transformation
Transformation to be passed to Apply Transformation module to clean new data
- Type
cleaning_transformation: Output
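To make two of the cleaning modes concrete, the sketch below fills missing entries in a single numeric column and also produces the kind of indicator column that `generate_missing_value_indicator_column` describes. It is a plain-Python approximation, not the component's implementation; the helper `clean_column` is hypothetical and missing values are represented as `None`.

```python
from statistics import mean
from typing import List, Optional, Tuple


def clean_column(
    values: List[Optional[float]],
    mode: str = "Custom substitution value",
    replacement_value: str = "0",
) -> Tuple[List[float], List[bool]]:
    """Hypothetical helper: fill missing values and report which rows were filled."""
    present = [v for v in values if v is not None]
    if mode == "Custom substitution value":
        fill = float(replacement_value)
    elif mode == "Replace with mean":
        fill = mean(present)
    else:
        raise ValueError(f"mode not covered by this sketch: {mode}")
    cleaned = [fill if v is None else v for v in values]
    indicator = [v is None for v in values]  # True where a value was replaced
    return cleaned, indicator


column = [1.0, None, 3.0]
with_default, was_missing = clean_column(column)
with_mean, _ = clean_column(column, mode="Replace with mean")
```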
- example.assets.example_components.azureml_clip_values(input: Optional[pathlib.Path] = None, clipmode: example.assets.example_components._assets._AzuremlClipValuesClipmodeEnum = _AzuremlClipValuesClipmodeEnum.clippeaks, upperthreshold: example.assets.example_components._assets._AzuremlClipValuesUpperthresholdEnum = _AzuremlClipValuesUpperthresholdEnum.constant, constantupperthreshold: float = 99, percentileupperthreshold: float = 99, modeuppersubstitute: example.assets.example_components._assets._AzuremlClipValuesModeuppersubstituteEnum = _AzuremlClipValuesModeuppersubstituteEnum.threshold, lowerthreshold: example.assets.example_components._assets._AzuremlClipValuesLowerthresholdEnum = _AzuremlClipValuesLowerthresholdEnum.constant, constantlowerthreshold: float = 1, percentilelowerthreshold: float = 1, modeowersubstitute: example.assets.example_components._assets._AzuremlClipValuesModeowersubstituteEnum = _AzuremlClipValuesModeowersubstituteEnum.threshold, lowerupperthreshold: example.assets.example_components._assets._AzuremlClipValuesLowerupperthresholdEnum = _AzuremlClipValuesLowerupperthresholdEnum.constant, constantuthreshold: float = 99, constantlthreshold: float = 1, percentileuthreshold: float = 99, percentilelthreshold: float = 1, modeusubstitute: example.assets.example_components._assets._AzuremlClipValuesModeusubstituteEnum = _AzuremlClipValuesModeusubstituteEnum.threshold, modelsubstitute: example.assets.example_components._assets._AzuremlClipValuesModelsubstituteEnum = _AzuremlClipValuesModelsubstituteEnum.threshold, column_selector: Optional[str] = None, inplace_flag: bool = True, indicator_flag: bool = False) example.assets.example_components._assets._AzuremlClipValuesComponent
Detects outliers and clips or replaces their values.
- Parameters
input (Path) – DataFrameDirectory
clipmode (_AzuremlClipValuesClipmodeEnum) – enum (enum: [‘ClipPeaks’, ‘ClipSubPeaks’, ‘ClipPeaksAndSubpeaks’])
upperthreshold (_AzuremlClipValuesUpperthresholdEnum) – enum (optional, enum: [‘Constant’, ‘Percentile’])
constantupperthreshold (float) – float (optional)
percentileupperthreshold (float) – float (optional)
modeuppersubstitute (_AzuremlClipValuesModeuppersubstituteEnum) – enum (optional, enum: [‘Threshold’, ‘Mean’, ‘Median’, ‘Missing’])
lowerthreshold (_AzuremlClipValuesLowerthresholdEnum) – enum (optional, enum: [‘Constant’, ‘Percentile’])
constantlowerthreshold (float) – float (optional)
percentilelowerthreshold (float) – float (optional)
modeowersubstitute (_AzuremlClipValuesModeowersubstituteEnum) – enum (optional, enum: [‘Threshold’, ‘Mean’, ‘Median’, ‘Missing’])
lowerupperthreshold (_AzuremlClipValuesLowerupperthresholdEnum) – enum (optional, enum: [‘Constant’, ‘Percentile’])
constantuthreshold (float) – float (optional)
constantlthreshold (float) – float (optional)
percentileuthreshold (float) – float (optional)
percentilelthreshold (float) – float (optional)
modeusubstitute (_AzuremlClipValuesModeusubstituteEnum) – enum (optional, enum: [‘Threshold’, ‘Mean’, ‘Median’, ‘Missing’])
modelsubstitute (_AzuremlClipValuesModelsubstituteEnum) – enum (optional, enum: [‘Threshold’, ‘Mean’, ‘Median’, ‘Missing’])
column_selector (str) – ColumnPicker
inplace_flag (bool) – boolean
indicator_flag (bool) – boolean
- Output result_dataset
DataFrameDirectory
- Type
result_dataset: Output
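The clipping behaviour described by the parameters above can be sketched in plain Python. This is a minimal illustration under assumptions, not the component's implementation: `clip_values` is a hypothetical helper, only constant thresholds and the 'Threshold' and 'Missing' substitutes are modelled, and missing values are represented as `None`.

```python
def clip_values(values, lower=1.0, upper=99.0,
                clipmode="ClipPeaksAndSubpeaks", substitute="Threshold"):
    """Hypothetical helper: clip a list of numbers against constant thresholds."""
    if substitute not in ("Threshold", "Missing"):
        raise ValueError("this sketch only models 'Threshold' and 'Missing'")
    clip_high = clipmode in ("ClipPeaks", "ClipPeaksAndSubpeaks")
    clip_low = clipmode in ("ClipSubPeaks", "ClipPeaksAndSubpeaks")
    result = []
    for value in values:
        if clip_high and value > upper:
            # Peak: replace with the upper threshold or mark as missing.
            result.append(upper if substitute == "Threshold" else None)
        elif clip_low and value < lower:
            # Sub-peak: replace with the lower threshold or mark as missing.
            result.append(lower if substitute == "Threshold" else None)
        else:
            result.append(value)
    return result


data = [-5.0, 0.5, 42.0, 99.0, 150.0]
clipped_both = clip_values(data)
clipped_peaks = clip_values(data, clipmode="ClipPeaks")
clipped_missing = clip_values(data, substitute="Missing")
```

Values equal to a threshold are left untouched in this sketch; whether the component treats the boundary the same way is not stated here.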
- example.assets.example_components.azureml_convert_to_csv(dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlConvertToCsvComponent
Converts data input to a comma-separated values format.
- Parameters
dataset (Path) – Input dataset
- Output results_dataset
Output dataset
- Type
results_dataset: Output
- example.assets.example_components.azureml_convert_to_dataset(dataset: Optional[pathlib.Path] = None, action: example.assets.example_components._assets._AzuremlConvertToDatasetActionEnum = _AzuremlConvertToDatasetActionEnum.none, custom_missing_value: str = '?', replace: example.assets.example_components._assets._AzuremlConvertToDatasetReplaceEnum = _AzuremlConvertToDatasetReplaceEnum.missing, custom_value: str = 'obs', new_value: str = '0') example.assets.example_components._assets._AzuremlConvertToDatasetComponent
Converts data input to the internal Dataset format used by Azure Machine Learning designer.
- Parameters
dataset (Path) – Input dataset
action (_AzuremlConvertToDatasetActionEnum) – Action to apply to input dataset (enum: [‘None’, ‘SetMissingValues’, ‘ReplaceValues’])
custom_missing_value (str) – Value indicating missing value token (optional)
replace (_AzuremlConvertToDatasetReplaceEnum) – Specifies type of replacement for values (optional, enum: [‘Missing’, ‘Custom’])
custom_value (str) – Value to be replaced (optional)
new_value (str) – Replacement value (optional)
- Output results_dataset
Output dataset
- Type
results_dataset: Output
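As a rough illustration of the three actions, the sketch below applies them to a flat list of string cells. The helper `convert_cells` is hypothetical, the semantics are inferred only from the parameter descriptions above, missing values are represented as `None`, and only the 'Custom' replacement type is modelled.

```python
def convert_cells(cells, action="None", custom_missing_value="?",
                  custom_value="obs", new_value="0"):
    """Hypothetical helper mirroring the action parameter on a list of cells."""
    if action == "None":
        # No change to the values.
        return list(cells)
    if action == "SetMissingValues":
        # Treat the token as a marker for a missing value.
        return [None if cell == custom_missing_value else cell for cell in cells]
    if action == "ReplaceValues":
        # Swap one exact value for another ('Custom' replacement only).
        return [new_value if cell == custom_value else cell for cell in cells]
    raise ValueError(f"unknown action: {action}")


cells = ["12", "?", "obs", "7"]
with_missing = convert_cells(cells, action="SetMissingValues")
with_replaced = convert_cells(cells, action="ReplaceValues")
```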
- example.assets.example_components.azureml_convert_to_image_directory(input_dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlConvertToImageDirectoryComponent
Convert dataset to image directory format.
- Parameters
input_dataset (Path) – Input dataset
- Output output_image_directory
Output image directory.
- Type
output_image_directory: Output
- example.assets.example_components.azureml_convert_to_indicator_values(dataset: Optional[pathlib.Path] = None, categorical_columns_to_convert: Optional[str] = None, overwrite_categorical_columns: bool = False) example.assets.example_components._assets._AzuremlConvertToIndicatorValuesComponent
Converts categorical values in columns to indicator values.
- Parameters
dataset (Path) – Dataset with categorical columns
categorical_columns_to_convert (str) – Select categorical columns to convert to indicator matrices.
overwrite_categorical_columns (bool) – If True, overwrite the selected categorical columns, otherwise append the resulting indicator matrices to the dataset (optional)
- Output results_dataset
Dataset with categorical columns converted to indicator matrices.
- Type
results_dataset: Output
- Output indicator_values_transformation
Transformation to be passed to Apply Transformation module to convert indicator values for new data
- Type
indicator_values_transformation: Output
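The effect of overwrite_categorical_columns can be sketched with plain dictionaries standing in for dataset rows. The helper `to_indicator_columns` and the `column-category` naming of the generated columns are assumptions made for illustration, not the component's documented output format.

```python
def to_indicator_columns(rows, column, overwrite=False):
    """Hypothetical helper: expand one categorical column into 0/1 columns."""
    categories = sorted({row[column] for row in rows})
    converted = []
    for row in rows:
        # Drop the source column only when overwriting; otherwise append.
        new_row = {key: value for key, value in row.items()
                   if not (overwrite and key == column)}
        for category in categories:
            new_row[f"{column}-{category}"] = 1 if row[column] == category else 0
        converted.append(new_row)
    return converted


rows = [{"color": "red", "n": 1}, {"color": "blue", "n": 2}]
appended = to_indicator_columns(rows, "color")
overwritten = to_indicator_columns(rows, "color", overwrite=True)
```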
- example.assets.example_components.azureml_convert_word_to_vector(dataset: Optional[pathlib.Path] = None, target_column: Optional[str] = None, word2vec_strategy: example.assets.example_components._assets._AzuremlConvertWordToVectorWord2VecStrategyEnum = _AzuremlConvertWordToVectorWord2VecStrategyEnum.gensim_word2vec, word2vec_training_algorithm: example.assets.example_components._assets._AzuremlConvertWordToVectorWord2VecTrainingAlgorithmEnum = _AzuremlConvertWordToVectorWord2VecTrainingAlgorithmEnum.skip_gram, length_of_word_embedding: int = 100, context_window_size: int = 5, number_of_epochs: int = 5, maximum_vocabulary_size: int = 10000, minimum_word_count: int = 5) example.assets.example_components._assets._AzuremlConvertWordToVectorComponent
Convert word to vector.
- Parameters
dataset (Path) – Input data
target_column (str) – Select one target column whose vocabulary embeddings will be generated
word2vec_strategy (_AzuremlConvertWordToVectorWord2VecStrategyEnum) – Select the strategy for computing word embedding (enum: [‘GloVe pretrained English Model’, ‘Gensim Word2Vec’, ‘Gensim FastText’])
word2vec_training_algorithm (_AzuremlConvertWordToVectorWord2VecTrainingAlgorithmEnum) – Select the training algorithm for training Word2Vec model (optional, enum: [‘Skip_gram’, ‘CBOW’])
length_of_word_embedding (int) – Specify the length of the word embedding/vector (optional, min: 10, max: 2000)
context_window_size (int) – Specify the maximum distance between the word being predicted and the current word (optional, min: 1, max: 100)
number_of_epochs (int) – Specify the number of epochs (iterations) over the corpus (optional, min: 1, max: 1024)
maximum_vocabulary_size (int) – Specify the maximum number of the words in vocabulary (min: 10, max: 2147483647)
minimum_word_count (int) – Ignores all words that have a frequency lower than this value (min: 1, max: 100)
- Output vocabulary_with_embeddings
Vocabulary with embeddings
- Type
vocabulary_with_embeddings: Output
- example.assets.example_components.azureml_create_python_model(python_script: str = '\n# The script MUST define a class named AzureMLModel.\n# This class MUST at least define the following three methods: "__init__", "train" and "predict".\n# The signatures (method and argument names) of all these methods MUST be exactly the same as the following example.\n\n# Please do not install extra packages such as "pip install xgboost" in this script,\n# otherwise errors will be raised when reading models in down-stream modules.\n\nimport pandas as pd\nfrom sklearn.linear_model import LogisticRegression\n\n\nclass AzureMLModel:\n # The __init__ method is only invoked in module "Create Python Model",\n # and will not be invoked again in the following modules "Train Model" and "Score Model".\n # The attributes defined in the __init__ method are preserved and usable in the train and predict method.\n def __init__(self):\n # self.model must be assigned\n self.model = LogisticRegression()\n self.feature_column_names = list()\n\n # Train model\n # Param<df_train>: a pandas.DataFrame\n # Param<df_label>: a pandas.Series\n def train(self, df_train, df_label):\n # self.feature_column_names records the column names used for training.\n # It is recommended to set this attribute before training so that the\n # feature columns used in predict and train methods have the same names.\n self.feature_column_names = df_train.columns.tolist()\n self.model.fit(df_train, df_label)\n\n # Predict results\n # Param<df>: a pandas.DataFrame\n # Must return a pandas.DataFrame\n def predict(self, df):\n # The feature columns used for prediction MUST have the same names as the ones for training.\n # The name of score column ("Scored Labels" in this case) MUST be different from any other\n # columns in input data.\n return pd.DataFrame({\'Scored Labels\': self.model.predict(df[self.feature_column_names])})\n') example.assets.example_components._assets._AzuremlCreatePythonModelComponent
Creates a Python model using a custom script.
- Parameters
python_script (str) – The Python script to execute
- Output untrained_model
An untrained custom Python model
- Type
untrained_model: Output
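The default python_script states that the script must define a class named AzureMLModel with the methods `__init__`, `train` and `predict`. A script can be checked against that contract before it is submitted; the sketch below uses the standard library `ast` module, and the helper `missing_model_methods` is a hypothetical name rather than part of this package.

```python
import ast

REQUIRED_METHODS = {"__init__", "train", "predict"}


def missing_model_methods(script):
    """Return the required method names that AzureMLModel does not define."""
    tree = ast.parse(script)
    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == "AzureMLModel":
            defined = {item.name for item in node.body
                       if isinstance(item, ast.FunctionDef)}
            return REQUIRED_METHODS - defined
    raise ValueError("script does not define a class named AzureMLModel")


good_script = '''
class AzureMLModel:
    def __init__(self):
        self.model = None

    def train(self, df_train, df_label):
        pass

    def predict(self, df):
        pass
'''
missing = missing_model_methods(good_script)
```

The check only inspects names; it does not verify argument names or that `self.model` is assigned.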
- example.assets.example_components.azureml_cross_validate_model(untrained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, name_or_numerical_index_of_the_label_column: Optional[str] = None, random_seed: int = 0) example.assets.example_components._assets._AzuremlCrossValidateModelComponent
Cross Validate a classification or regression model with standard metrics.
- Parameters
untrained_model (Path) – Untrained learner
dataset (Path) – Training data
name_or_numerical_index_of_the_label_column (str) – Select the column that contains the label or outcome column
random_seed (int) – Specify a numeric seed to use for random number generation. (max: 4294967295)
- Output scored_results
Data scored results
- Type
scored_results: Output
- Output evaluation_results_by_fold
Data evaluation results by fold
- Type
evaluation_results_by_fold: Output
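The role of random_seed in producing per-fold results can be illustrated with a generic k-fold split. This is a sketch of the general technique only: the number of folds and the shuffling strategy used by the component are not stated here, so `n_folds=5` and the helper `k_fold_indices` are assumptions.

```python
import random


def k_fold_indices(n_rows, n_folds=5, random_seed=0):
    """Hypothetical helper: return (train, test) row-index pairs, one per fold."""
    indices = list(range(n_rows))
    # A fixed seed makes the shuffle, and so the folds, reproducible.
    random.Random(random_seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(10, n_folds=5, random_seed=0)
```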
- example.assets.example_components.azureml_decision_forest_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlDecisionForestRegressionCreateTrainerModeEnum = _AzuremlDecisionForestRegressionCreateTrainerModeEnum.singleparameter, number_of_decision_trees: int = 8, maximum_depth_of_the_decision_trees: int = 32, minimum_number_of_samples_per_leaf_node: int = 1, range_for_number_of_decision_trees: str = '1; 8; 32', range_for_the_maximum_depth_of_the_decision_trees: str = '1; 16; 64', range_for_the_minimum_number_of_samples_per_leaf_node: str = '1; 4; 16', resampling_method: example.assets.example_components._assets._AzuremlDecisionForestRegressionResamplingMethodEnum = _AzuremlDecisionForestRegressionResamplingMethodEnum.bagging_resampling) example.assets.example_components._assets._AzuremlDecisionForestRegressionComponent
Creates a regression model using the decision forest algorithm.
- Parameters
create_trainer_mode (_AzuremlDecisionForestRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
number_of_decision_trees (int) – Specify the number of decision trees to create in the ensemble (optional, min: 1)
maximum_depth_of_the_decision_trees (int) – Specify the maximum depth of any decision tree that can be created in the ensemble (optional, min: 1)
minimum_number_of_samples_per_leaf_node (int) – Specify the minimum number of training samples required to generate a leaf node (optional, min: 1)
range_for_number_of_decision_trees (str) – Specify range for the number of decision trees to create in the ensemble (optional)
range_for_the_maximum_depth_of_the_decision_trees (str) – Specify range for the maximum depth of the decision trees (optional)
range_for_the_minimum_number_of_samples_per_leaf_node (str) – Specify range for the minimum number of samples per leaf node (optional)
resampling_method (_AzuremlDecisionForestRegressionResamplingMethodEnum) – Choose a resampling method (enum: [‘Bagging Resampling’, ‘Replicate Resampling’])
- Output untrained_model
An untrained regression model
- Type
untrained_model: Output
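The range_for_* parameters take semicolon-separated strings such as '1; 8; 32'. The sketch below parses those defaults and enumerates every combination; `parse_range` is a hypothetical helper, and whether a parameter sweep actually visits the full grid is an assumption.

```python
from itertools import product


def parse_range(text, cast=int):
    """Hypothetical helper: split a '1; 8; 32' style string into numbers."""
    return [cast(part.strip()) for part in text.split(";") if part.strip()]


trees = parse_range("1; 8; 32")
depths = parse_range("1; 16; 64")
leaf_samples = parse_range("1; 4; 16")

# Every (trees, depth, samples-per-leaf) combination of the default ranges.
grid = list(product(trees, depths, leaf_samples))
```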
- example.assets.example_components.azureml_densenet(model_name: example.assets.example_components._assets._AzuremlDensenetModelNameEnum = _AzuremlDensenetModelNameEnum.densenet201, pretrained: bool = True, memory_efficient: bool = False) example.assets.example_components._assets._AzuremlDensenetComponent
Creates an image classification model using the DenseNet algorithm.
- Parameters
model_name (_AzuremlDensenetModelNameEnum) – Name of a certain densenet structure (enum: [‘densenet121’, ‘densenet161’, ‘densenet169’, ‘densenet201’])
pretrained (bool) – Indicate whether to use a model pre-trained on ImageNet
memory_efficient (bool) – Indicate whether to use checkpointing, which is much more memory efficient but slower
- Output untrained_model
Untrained densenet model path
- Type
untrained_model: Output
- example.assets.example_components.azureml_edit_metadata(dataset: Optional[pathlib.Path] = None, column: Optional[str] = None, data_type: example.assets.example_components._assets._AzuremlEditMetadataDataTypeEnum = _AzuremlEditMetadataDataTypeEnum.unchanged, date_and_time_format: Optional[str] = None, categorical: example.assets.example_components._assets._AzuremlEditMetadataCategoricalEnum = _AzuremlEditMetadataCategoricalEnum.unchanged, fields: example.assets.example_components._assets._AzuremlEditMetadataFieldsEnum = _AzuremlEditMetadataFieldsEnum.unchanged, new_column_name: Optional[str] = None) example.assets.example_components._assets._AzuremlEditMetadataComponent
Edits metadata associated with columns in a dataset.
- Parameters
dataset (Path) – Input dataset
column (str) – Choose the columns to which your changes should apply
data_type (_AzuremlEditMetadataDataTypeEnum) – Specify the new data type of the column (enum: [‘Unchanged’, ‘String’, ‘Integer’, ‘Double’, ‘Boolean’, ‘DateTime’])
date_and_time_format (str) – Specify a custom format string for parsing DateTime values; refer to the Python standard library datetime.strftime() documentation for details. Leave empty for default permissive parsing (optional)
categorical (_AzuremlEditMetadataCategoricalEnum) – Indicate whether the column should be flagged as categorical (enum: [‘Unchanged’, ‘Categorical’, ‘NonCategorical’])
fields (_AzuremlEditMetadataFieldsEnum) – Specify whether the column should be considered a feature or label by learning algorithms (enum: [‘Unchanged’, ‘Features’, ‘Labels’, ‘ClearFeatures’, ‘ClearLabels’, ‘ClearScores’])
new_column_name (str) – Type the new names of the columns (optional)
- Output results_dataset
Dataset with changed metadata
- Type
results_dataset: Output
- example.assets.example_components.azureml_enter_data_manually(dataformat: example.assets.example_components._assets._AzuremlEnterDataManuallyDataformatEnum = _AzuremlEnterDataManuallyDataformatEnum.csv, hasheader: bool = True, data: Optional[str] = None) example.assets.example_components._assets._AzuremlEnterDataManuallyComponent
Enables entering and editing small datasets by typing values.
- Parameters
dataformat (_AzuremlEnterDataManuallyDataformatEnum) – Select which format data will be entered (enum: [‘ARFF’, ‘CSV’, ‘SvmLight’, ‘TSV’])
hasheader (bool) – CSV or TSV file has a header (optional)
data (str) – Text to output as DataTable
- Output dataset
Entered data
- Type
dataset: Output
- example.assets.example_components.azureml_evaluate_model(scored_dataset: Optional[pathlib.Path] = None, scored_dataset_to_compare: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlEvaluateModelComponent
Evaluates the results of a classification or regression model with standard metrics.
- Parameters
scored_dataset (Path) – Scored dataset
scored_dataset_to_compare (Path) – Scored dataset to compare (optional)
- Output evaluation_results
Data evaluation result
- Type
evaluation_results: Output
- example.assets.example_components.azureml_evaluate_recommender(test_dataset: Optional[pathlib.Path] = None, scored_dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlEvaluateRecommenderComponent
Evaluate a recommendation model.
- Parameters
test_dataset (Path) – Test dataset
scored_dataset (Path) – Scored dataset
- Output metric
A table of evaluation metrics
- Type
metric: Output
- example.assets.example_components.azureml_execute_python_script(dataset1: Optional[pathlib.Path] = None, dataset2: Optional[pathlib.Path] = None, script_bundle: Optional[pathlib.Path] = None, python_script: str = '\n# The script MUST contain a function named azureml_main\n# which is the entry point for this module.\n\n# imports up here can be used to\nimport pandas as pd\n\n# The entry point function MUST have two input arguments.\n# If the input port is not connected, the corresponding\n# dataframe argument will be None.\n# Param<dataframe1>: a pandas.DataFrame\n# Param<dataframe2>: a pandas.DataFrame\ndef azureml_main(dataframe1 = None, dataframe2 = None):\n\n # Execution logic goes here\n print(f\'Input pandas.DataFrame #1: {dataframe1}\')\n\n # If a zip file is connected to the third input port,\n # it is unzipped under "./Script Bundle". This directory is added\n # to sys.path. Therefore, if your zip file contains a Python file\n # mymodule.py you can import it using:\n # import mymodule\n\n # Return value must be of a sequence of pandas.DataFrame\n # E.g.\n # - Single return value: return dataframe1,\n # - Two return values: return dataframe1, dataframe2\n return dataframe1,\n\n') example.assets.example_components._assets._AzuremlExecutePythonScriptComponent
Executes a Python script from an Azure Machine Learning designer pipeline.
- Parameters
dataset1 (Path) – Input dataset 1 (optional)
dataset2 (Path) – Input dataset 2 (optional)
script_bundle (Path) – Zip file containing custom resources (optional)
python_script (str) – The Python script to execute
- Output result_dataset
Output Dataset
- Type
result_dataset: Output
- Output python_device
Output Dataset2
- Type
python_device: Output
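The default python_script ends with `return dataframe1,`. The trailing comma matters: it makes the return value a one-element tuple, which satisfies the requirement that the entry point return a sequence. The sketch below shows this with a plain list standing in for a pandas.DataFrame, so it runs without pandas installed.

```python
def azureml_main(dataframe1=None, dataframe2=None):
    # The trailing comma builds a one-element tuple (a sequence),
    # rather than returning the bare object.
    return dataframe1,


# A list is used here only as a stand-in for a pandas.DataFrame.
result = azureml_main(dataframe1=[1, 2, 3])

# With no input ports connected, both arguments default to None.
unconnected = azureml_main()
```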
- example.assets.example_components.azureml_execute_r_script(dataset1: Optional[pathlib.Path] = None, dataset2: Optional[pathlib.Path] = None, script_bundle: Optional[pathlib.Path] = None, r_script: str = '\n# R version: 3.5.1\n# The script MUST contain a function named azureml_main\n# which is the entry point for this module.\n\n# Please note that functions dependant on X11 library\n# such as "View" are not supported because X11 library\n# is not pre-installed.\n\n# The entry point function MUST have two input arguments.\n# If the input port is not connected, the corresponding\n# dataframe argument will be null.\n# Param<dataframe1>: a R DataFrame\n# Param<dataframe2>: a R DataFrame\nazureml_main <- function(dataframe1, dataframe2){\n print("R script run.")\n\n # If a zip file is connected to the third input port, it is\n # unzipped under "./Script Bundle". This directory is added\n # to sys.path.\n\n # Return datasets as a Named List\n return(list(dataset1=dataframe1, dataset2=dataframe2))\n}\n\n', random_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlExecuteRScriptComponent
Executes an R script from an Azure Machine Learning designer pipeline.
- Parameters
dataset1 (Path) – Input dataset 1 (optional)
dataset2 (Path) – Input dataset 2 (optional)
script_bundle (Path) – Set of R sources (optional)
r_script (str) – Specify a StreamReader pointing to the R script sources
random_seed (int) – Define a random seed value for use inside the R environment. Calls “set.seed(value)” (optional)
- Output result_dataset
Output Dataset
- Type
result_dataset: Output
- Output r_device
Output Dataset2
- Type
r_device: Output
- example.assets.example_components.azureml_export_data(input_path: Optional[pathlib.Path] = None, datastore_type: Optional[str] = None, output_data_store: Optional[str] = None, output_path: Optional[str] = None, output_file_type: Optional[str] = None, datatable_name: Optional[str] = None, column_list_to_be_saved: Optional[str] = None, column_list_datatable_columns: Optional[str] = None, number_rows_per_operation: int = 50) example.assets.example_components._assets._AzuremlExportDataComponent
Writes a dataset to cloud-based storage in Azure, such as Azure blob storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2.
- Parameters
input_path (Path) – export data
datastore_type (str) – datastore type (optional)
output_data_store (str) – the location of output data store
output_path (str) – the relative output path in the data store (optional)
output_file_type (str) – the file type to be outputted (optional)
datatable_name (str) – export data table name (optional)
column_list_to_be_saved (str) – selected column(s) to be exported (optional)
column_list_datatable_columns (str) – column names in export data table (optional)
number_rows_per_operation (int) – number of rows per operation (optional)
- example.assets.example_components.azureml_extract_n_gram_features_from_text(dataset: Optional[pathlib.Path] = None, input_vocabulary: Optional[pathlib.Path] = None, text_column: Optional[str] = None, vocabulary_mode: example.assets.example_components._assets._AzuremlExtractNGramFeaturesFromTextVocabularyModeEnum = _AzuremlExtractNGramFeaturesFromTextVocabularyModeEnum.create, n_grams_size: int = 1, weighting_function: example.assets.example_components._assets._AzuremlExtractNGramFeaturesFromTextWeightingFunctionEnum = _AzuremlExtractNGramFeaturesFromTextWeightingFunctionEnum.binary_weight, minimum_word_length: int = 3, maximum_word_length: int = 25, minimum_n_gram_document_absolute_frequency: float = 5, maximum_n_gram_document_ratio: float = 1, normalize_n_gram_feature_vectors: bool = False) example.assets.example_components._assets._AzuremlExtractNGramFeaturesFromTextComponent
Creates N-Gram dictionary features and does feature selection on them.
- Parameters
dataset (Path) – Input data
input_vocabulary (Path) – Input vocabulary (optional)
text_column (str) – Name or index (one-based) of text column
vocabulary_mode (_AzuremlExtractNGramFeaturesFromTextVocabularyModeEnum) – Specify how the n-gram vocabulary should be created from the corpus (enum: [‘Create’, ‘ReadOnly’])
n_grams_size (int) – Indicate the maximum size of n-grams to create (min: 1)
weighting_function (_AzuremlExtractNGramFeaturesFromTextWeightingFunctionEnum) – Choose the weighting function to apply to each n-gram value (enum: [‘Binary Weight’, ‘TF Weight’, ‘IDF Weight’, ‘TF-IDF Weight’])
minimum_word_length (int) – Specify the minimum length of words to include in n-grams (min: 1)
maximum_word_length (int) – Specify the maximum length of words to include in n-grams (min: 2)
minimum_n_gram_document_absolute_frequency (float) – Minimum n-gram document absolute frequency (min: 1.0)
maximum_n_gram_document_ratio (float) – Maximum n-gram document ratio (min: 0.0001)
normalize_n_gram_feature_vectors (bool) – Normalize n-gram feature vectors. If true, then the n-gram feature vector is divided by its L2 norm.
- Output results_dataset
Extracted features
- Type
results_dataset: Output
- Output result_vocabulary
Result vocabulary
- Type
result_vocabulary: Output
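How n_grams_size, the word-length limits and the 'Binary Weight' option interact can be sketched in plain Python. The helpers `extract_ngrams` and `binary_weights` are hypothetical, whitespace tokenisation is an assumption, and so is filtering words by length before n-grams are formed.

```python
def extract_ngrams(text, n_grams_size=1, minimum_word_length=3,
                   maximum_word_length=25):
    """Hypothetical helper: all n-grams of size 1..n_grams_size from one text."""
    words = [word for word in text.lower().split()
             if minimum_word_length <= len(word) <= maximum_word_length]
    ngrams = []
    for size in range(1, n_grams_size + 1):
        for start in range(len(words) - size + 1):
            ngrams.append(" ".join(words[start:start + size]))
    return ngrams


def binary_weights(documents, **options):
    """Build a vocabulary and a 0/1 presence matrix (binary weighting)."""
    doc_ngrams = [set(extract_ngrams(doc, **options)) for doc in documents]
    vocabulary = sorted(set().union(*doc_ngrams))
    matrix = [[1 if term in grams else 0 for term in vocabulary]
              for grams in doc_ngrams]
    return vocabulary, matrix


bigrams = extract_ngrams("the quick brown fox", n_grams_size=2)
vocabulary, matrix = binary_weights(["red fox", "red hen"])
```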
- example.assets.example_components.azureml_fast_forest_quantile_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlFastForestQuantileRegressionCreateTrainerModeEnum = _AzuremlFastForestQuantileRegressionCreateTrainerModeEnum.singleparameter, number_of_trees: int = 100, number_of_leaves: int = 20, minimum_number_of_training_instances_required_to_form_a_leaf: int = 10, bagging_fraction: float = 0.7, split_fraction: float = 0.7, quantiles_to_be_estimated: str = '0.25; 0.5; 0.75', range_for_total_number_of_trees_constructed: str = '16; 32; 64', range_for_maximum_number_of_leaves_per_tree: str = '16; 32; 64', range_for_minimum_number_of_training_instances_required_to_form_a_leaf: str = '1; 5; 10', range_for_bagging_fraction: str = '0.25; 0.5; 0.75', range_for_split_fraction: str = '0.25; 0.5; 0.75', required_quantile_values: str = '0.25; 0.5; 0.75', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlFastForestQuantileRegressionComponent
Creates a quantile regression model.
- Parameters
create_trainer_mode (_AzuremlFastForestQuantileRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
number_of_trees (int) – Specifies the number of trees to be constructed (optional)
number_of_leaves (int) – Specifies the maximum number of leaves per tree. The default number is 20 (optional, min: 2)
minimum_number_of_training_instances_required_to_form_a_leaf (int) – Indicates the minimum number of training instances required to form a leaf (optional)
bagging_fraction (float) – Specifies the fraction of training data to use for each tree (optional)
split_fraction (float) – Specifies the fraction of features (chosen randomly) to use for each split (optional)
quantiles_to_be_estimated (str) – Specifies the quantile to be estimated (optional)
range_for_total_number_of_trees_constructed (str) – Specify the range for the maximum number of trees that can be created during training (optional)
range_for_maximum_number_of_leaves_per_tree (str) – Specify range for the maximum number of leaves allowed per tree (optional)
range_for_minimum_number_of_training_instances_required_to_form_a_leaf (str) – Specify the range for the minimum number of cases required to form a leaf (optional)
range_for_bagging_fraction (str) – Specifies the range for fraction of training data to use for each tree (optional)
range_for_split_fraction (str) – Specifies the range for fraction of features (chosen randomly) to use for each split (optional)
required_quantile_values (str) – Required quantile value used during parameter sweep (optional)
random_number_seed (int) – Provide a seed for the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained quantile regression model that can be connected to the Train Generic Model or Cross Validate Model modules.
- Type
untrained_model: Output
- example.assets.example_components.azureml_feature_hashing(dataset: Optional[pathlib.Path] = None, target_column: Optional[str] = None, hashing_bitsize: int = 10, n_grams: int = 2) example.assets.example_components._assets._AzuremlFeatureHashingComponent
Converts text data to numeric features using the nimbusml library.
- Parameters
dataset (Path) – Input dataset
target_column (str) – Choose the columns to which hashing will be applied
hashing_bitsize (int) – Type the number of bits used to hash the selected columns (min: 1, max: 31)
n_grams (int) – Specify the number of N-grams generated during hashing (max: 10)
- Output transformed_dataset
Output dataset with hashed columns. The number of feature columns generated depends on the hashing_bitsize parameter.
- Type
transformed_dataset: Output
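The relationship between hashing_bitsize and the width of the output can be shown with a generic hashing-trick sketch, in which a bit size of b yields 2^b feature slots. This is not the nimbusml implementation: the helper `hash_features`, the whitespace tokenisation and the use of SHA-256 as the hash function are all assumptions chosen to keep the example deterministic and dependency-free.

```python
import hashlib


def hash_features(text, hashing_bitsize=10, n_grams=2):
    """Hypothetical helper: count n-grams into 2**hashing_bitsize slots."""
    n_columns = 2 ** hashing_bitsize
    vector = [0] * n_columns
    words = text.lower().split()
    for size in range(1, n_grams + 1):
        for start in range(len(words) - size + 1):
            token = " ".join(words[start:start + size])
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            # Map the hash onto one of the available columns.
            index = int.from_bytes(digest[:4], "little") % n_columns
            vector[index] += 1
    return vector


vector = hash_features("good service good price", hashing_bitsize=4, n_grams=2)
```

Different tokens can collide in the same slot, which is why a larger bit size trades memory for fewer collisions.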
- example.assets.example_components.azureml_filter_based_feature_selection(input_dataset: Optional[pathlib.Path] = None, operate_on_feature_columns_only: bool = True, target_column: Optional[str] = None, number_of_desired_features: int = 1, feature_scoring_method: example.assets.example_components._assets._AzuremlFilterBasedFeatureSelectionFeatureScoringMethodEnum = _AzuremlFilterBasedFeatureSelectionFeatureScoringMethodEnum.pearsoncorrelation) example.assets.example_components._assets._AzuremlFilterBasedFeatureSelectionComponent
Identifies the features in a dataset with the greatest predictive power.
- Parameters
input_dataset (Path) – Input dataset
operate_on_feature_columns_only (bool) – Indicate whether to use only feature columns in the scoring process (optional)
target_column (str) – Specify the target column
number_of_desired_features (int) – Specify the number of features to output in results
feature_scoring_method (_AzuremlFilterBasedFeatureSelectionFeatureScoringMethodEnum) – Choose the method to use for scoring (enum: [‘PearsonCorrelation’, ‘ChiSquared’])
- Output filtered_dataset
Filtered dataset
- Type
filtered_dataset: Output
- Output features
Names of output columns and feature selection scores
- Type
features: Output
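The 'PearsonCorrelation' scoring method can be sketched as follows: each feature column is scored by its correlation with the target, and the highest-scoring columns are kept. The helpers `pearson` and `top_features` are hypothetical, and ranking by absolute correlation is an assumption about how scores are ordered.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # A constant column carries no linear signal.
    return cov / (spread_x * spread_y)


def top_features(columns, target, number_of_desired_features=1):
    """Rank columns by absolute correlation with the target; keep the top ones."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:number_of_desired_features], scores


columns = {"a": [1, 2, 3, 4], "b": [1, -1, 1, -1], "c": [4, 3, 2, 2]}
target = [2, 4, 6, 8]
selected, scores = top_features(columns, target, number_of_desired_features=2)
```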
- example.assets.example_components.azureml_group_data_into_bins(dataset: Optional[pathlib.Path] = None, binning_mode: example.assets.example_components._assets._AzuremlGroupDataIntoBinsBinningModeEnum = _AzuremlGroupDataIntoBinsBinningModeEnum.quantiles, number_of_bins: int = 10, quantile_normalization: example.assets.example_components._assets._AzuremlGroupDataIntoBinsQuantileNormalizationEnum = _AzuremlGroupDataIntoBinsQuantileNormalizationEnum.percent, comma_separated_list_of_bin_edges: Optional[str] = None, columns_to_bin: Optional[str] = None, output_mode: example.assets.example_components._assets._AzuremlGroupDataIntoBinsOutputModeEnum = _AzuremlGroupDataIntoBinsOutputModeEnum.append, tag_columns_as_categorical: bool = True) example.assets.example_components._assets._AzuremlGroupDataIntoBinsComponent
Map input values to a smaller number of bins using a quantization function.
- Parameters
dataset (Path) – Dataset to be analyzed
binning_mode (_AzuremlGroupDataIntoBinsBinningModeEnum) – Choose a binning method (enum: [‘Quantiles’, ‘Equal Width’, ‘Custom Edges’])
number_of_bins (int) – Specify the desired number of bins (optional, min: 1)
quantile_normalization (_AzuremlGroupDataIntoBinsQuantileNormalizationEnum) – Choose the method for normalizing quantiles (optional, enum: [‘Percent’, ‘PQuantile’, ‘Quantile Index’])
comma_separated_list_of_bin_edges (str) – Type a comma-separated list of numbers to use as bin edges (optional)
columns_to_bin (str) – Choose columns for quantization
output_mode (_AzuremlGroupDataIntoBinsOutputModeEnum) – Indicate how quantized columns should be output (enum: [‘Append’, ‘Inplace’, ‘Result Only’])
tag_columns_as_categorical (bool) – Indicate whether output columns should be tagged as categorical
- Output quantized_dataset
Dataset with quantized columns
- Type
quantized_dataset: Output
- Output binning_transformation
Transformation that applies quantization to the dataset
- Type
binning_transformation: Output
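The 'Equal Width' and 'Custom Edges' binning modes can be sketched with the standard library. The helpers `equal_width_edges` and `assign_bins` are hypothetical, and both the one-based bin numbering and the rule that a value equal to an edge falls in the lower bin are assumptions.

```python
from bisect import bisect_left


def equal_width_edges(values, number_of_bins=10):
    """Interior bin edges that split [min, max] into equally wide bins."""
    low, high = min(values), max(values)
    width = (high - low) / number_of_bins
    return [low + width * i for i in range(1, number_of_bins)]


def assign_bins(values, edges):
    """Map each value to a one-based bin index given sorted interior edges."""
    return [bisect_left(edges, value) + 1 for value in values]


values = list(range(11))  # 0, 1, ..., 10
edges = equal_width_edges(values, number_of_bins=5)
bins = assign_bins([0, 2, 3, 10], edges)

# 'Custom Edges': parse a comma-separated list such as "10, 20".
custom_edges = [float(part) for part in "10, 20".split(",")]
custom_bins = assign_bins([5, 15, 25], custom_edges)
```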
- example.assets.example_components.azureml_import_data(input_dataset_request_dto: Optional[str] = None, data_store_type: Optional[str] = None, override_data_store_name: Optional[str] = None, override_data_path: Optional[str] = None) example.assets.example_components._assets._AzuremlImportDataComponent
Load data from web URLs or from various cloud-based storage in Azure, such as Azure SQL database, Azure blob storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2.
- Parameters
input_dataset_request_dto (str) – input dataset Id/Object
data_store_type (str) – data store type (optional)
override_data_store_name (str) – string (optional)
override_data_path (str) – string (optional)
- Output output_data
DataFrameDirectory
- Type
output_data: Output
- example.assets.example_components.azureml_init_image_transformation(resize: example.assets.example_components._assets._AzuremlInitImageTransformationResizeEnum = _AzuremlInitImageTransformationResizeEnum.true, size: int = 256, center_crop: example.assets.example_components._assets._AzuremlInitImageTransformationCenterCropEnum = _AzuremlInitImageTransformationCenterCropEnum.true, crop_size: int = 224, pad: example.assets.example_components._assets._AzuremlInitImageTransformationPadEnum = _AzuremlInitImageTransformationPadEnum.false, padding: int = 0, color_jitter: bool = False, grayscale: bool = False, random_resized_crop: example.assets.example_components._assets._AzuremlInitImageTransformationRandomResizedCropEnum = _AzuremlInitImageTransformationRandomResizedCropEnum.false, random_resized_crop_size: int = 256, random_crop: example.assets.example_components._assets._AzuremlInitImageTransformationRandomCropEnum = _AzuremlInitImageTransformationRandomCropEnum.false, random_crop_size: int = 224, random_horizontal_flip: bool = True, random_vertical_flip: bool = False, random_rotation: example.assets.example_components._assets._AzuremlInitImageTransformationRandomRotationEnum = _AzuremlInitImageTransformationRandomRotationEnum.false, random_rotation_degrees: int = 0, random_affine: example.assets.example_components._assets._AzuremlInitImageTransformationRandomAffineEnum = _AzuremlInitImageTransformationRandomAffineEnum.false, random_affine_degrees: int = 0, random_grayscale: bool = False, random_perspective: bool = False) example.assets.example_components._assets._AzuremlInitImageTransformationComponent
Initialize image transformation.
- Parameters
resize (_AzuremlInitImageTransformationResizeEnum) – Resize the input PIL Image to the given size (enum: [‘False’, ‘True’])
size (int) – Desired output size (optional, min: 1)
center_crop (_AzuremlInitImageTransformationCenterCropEnum) – Crops the given PIL Image at the center (enum: [‘False’, ‘True’])
crop_size (int) – Desired output size of the crop (optional, min: 1)
pad (_AzuremlInitImageTransformationPadEnum) – Pad the given PIL Image on all sides with the given “pad” value (enum: [‘False’, ‘True’])
padding (int) – Padding on each border (optional)
color_jitter (bool) – Randomly change the brightness, contrast and saturation of an image
grayscale (bool) – Convert image to grayscale
random_resized_crop (_AzuremlInitImageTransformationRandomResizedCropEnum) – Crop the given PIL Image to random size and aspect ratio (enum: [‘False’, ‘True’])
random_resized_crop_size (int) – Expected output size of each edge (optional, min: 1)
random_crop (_AzuremlInitImageTransformationRandomCropEnum) – Crop the given PIL Image at a random location (enum: [‘False’, ‘True’])
random_crop_size (int) – Desired output size of the crop (optional, min: 1)
random_horizontal_flip (bool) – Horizontally flip the given PIL Image randomly with a given probability
random_vertical_flip (bool) – Vertically flip the given PIL Image randomly with a given probability
random_rotation (_AzuremlInitImageTransformationRandomRotationEnum) – Rotate the image by angle (enum: [‘False’, ‘True’])
random_rotation_degrees (int) – Range of degrees to select from (optional, max: 180)
random_affine (_AzuremlInitImageTransformationRandomAffineEnum) – Random affine transformation of the image keeping center invariant (enum: [‘False’, ‘True’])
random_affine_degrees (int) – Range of degrees to select from (optional, max: 180)
random_grayscale (bool) – Randomly convert image to grayscale with a probability of p (default 0.1)
random_perspective (bool) – Performs Perspective transformation of the given PIL Image randomly with a given probability
- Output output_image_transformation
Output image transformation
- Type
output_image_transformation: Output
- example.assets.example_components.azureml_join_data(left_dataset: Optional[pathlib.Path] = None, right_dataset: Optional[pathlib.Path] = None, comma_separated_case_sensitive_names_of_join_key_columns_for_l: Optional[str] = None, comma_separated_case_sensitive_names_of_join_key_columns_for_r: Optional[str] = None, match_case: bool = True, join_type: example.assets.example_components._assets._AzuremlJoinDataJoinTypeEnum = _AzuremlJoinDataJoinTypeEnum.inner_join, keep_right_key_columns_in_joined_table: bool = True) example.assets.example_components._assets._AzuremlJoinDataComponent
Joins two datasets on selected key columns.
- Parameters
left_dataset (Path) – First dataset to join
right_dataset (Path) – Second dataset to join
comma_separated_case_sensitive_names_of_join_key_columns_for_l (str) – Select the join key columns for the first dataset
comma_separated_case_sensitive_names_of_join_key_columns_for_r (str) – Select the join key columns for the second dataset
match_case (bool) – Indicate whether a case-sensitive comparison is allowed on key columns
join_type (_AzuremlJoinDataJoinTypeEnum) – Choose a join type (enum: [‘Inner Join’, ‘Left Outer Join’, ‘Full Outer Join’, ‘Left Semi-Join’])
keep_right_key_columns_in_joined_table (bool) – Indicate whether to keep key columns from the second dataset in the joined dataset (optional)
- Output results_dataset
Result of join operation
- Type
results_dataset: Output
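The four join types and the match_case flag can be illustrated with a small standard-library sketch. This is not the component's implementation: the helper join_rows and the sample rows are hypothetical, rows are plain dicts, and unmatched rows in outer joins simply omit the other side's columns rather than filling them with missing values.

```python
def join_rows(left, right, left_keys, right_keys,
              join_type="Inner Join", match_case=True):
    """Join two lists of dict rows on key columns (illustrative sketch only)."""

    def key_of(row, keys):
        values = tuple(row[k] for k in keys)
        if not match_case:
            # Case-insensitive comparison: fold string keys to lower case.
            values = tuple(v.lower() if isinstance(v, str) else v for v in values)
        return values

    # Index right-hand rows by their join key.
    index = {}
    for pos, row in enumerate(right):
        index.setdefault(key_of(row, right_keys), []).append(pos)

    joined, matched_right = [], set()
    for row in left:
        positions = index.get(key_of(row, left_keys), [])
        if join_type == "Left Semi-Join":
            # Keep left rows that have at least one match; add no right columns.
            if positions:
                joined.append(dict(row))
            continue
        for pos in positions:
            matched_right.add(pos)
            joined.append({**row, **right[pos]})
        if not positions and join_type in ("Left Outer Join", "Full Outer Join"):
            joined.append(dict(row))

    if join_type == "Full Outer Join":
        # Append right rows that never matched a left row.
        joined.extend(dict(right[pos]) for pos in range(len(right))
                      if pos not in matched_right)
    return joined


left = [{"id": "A", "x": 1}, {"id": "b", "x": 2}, {"id": "C", "x": 3}]
right = [{"rid": "a", "y": 10}, {"rid": "b", "y": 20}, {"rid": "D", "y": 40}]

inner = join_rows(left, right, ["id"], ["rid"])
inner_ignore_case = join_rows(left, right, ["id"], ["rid"], match_case=False)
full_outer = join_rows(left, right, ["id"], ["rid"], join_type="Full Outer Join")
semi = join_rows(left, right, ["id"], ["rid"], join_type="Left Semi-Join")
```

With case-sensitive matching only the "b" rows pair up. Ignoring case also pairs "A" with "a".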
- example.assets.example_components.azureml_k_means_clustering(create_trainer_mode: example.assets.example_components._assets._AzuremlKMeansClusteringCreateTrainerModeEnum = _AzuremlKMeansClusteringCreateTrainerModeEnum.singleparameter, number_of_centroids: int = 2, initialization: example.assets.example_components._assets._AzuremlKMeansClusteringInitializationEnum = _AzuremlKMeansClusteringInitializationEnum.k_means, random_number_seed: Optional[int] = None, metric: example.assets.example_components._assets._AzuremlKMeansClusteringMetricEnum = _AzuremlKMeansClusteringMetricEnum.euclidean, should_input_instances_be_normalized: bool = True, iterations: int = 100, assign_label_mode: example.assets.example_components._assets._AzuremlKMeansClusteringAssignLabelModeEnum = _AzuremlKMeansClusteringAssignLabelModeEnum.ignore_label_column) example.assets.example_components._assets._AzuremlKMeansClusteringComponent
Initialize K-Means clustering model.
- Parameters
create_trainer_mode (_AzuremlKMeansClusteringCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’])
number_of_centroids (int) – Number of Centroids (optional, min: 2)
initialization (_AzuremlKMeansClusteringInitializationEnum) – Initialization algorithm (optional, enum: [‘Random’, ‘K-Means++’, ‘Default’])
random_number_seed (int) – Type a value to seed the random number generator used to generate centroids for the training model. Leave blank to have a value randomly chosen on the first training run. (optional, max: 4294967295)
metric (_AzuremlKMeansClusteringMetricEnum) – Selected metric (enum: [‘Euclidean’])
should_input_instances_be_normalized (bool) – Indicate whether instances should be normalized
iterations (int) – Number of iterations (min: 1)
assign_label_mode (_AzuremlKMeansClusteringAssignLabelModeEnum) – Mode of value assignment to the labeled column (enum: [‘Ignore label column’, ‘Fill missing values’, ‘Overwrite from closest to center’])
- Output untrained_model
Untrained K-Means clustering model
- Type
untrained_model: Output
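To show how number_of_centroids, iterations, random_number_seed and the Euclidean metric interact, here is a minimal sketch of plain Lloyd's algorithm with random initialization. It assumes points are tuples of floats. The function k_means is hypothetical and is not the code the component runs. Input normalization and label assignment modes are left out.

```python
import math
import random


def k_means(points, number_of_centroids=2, iterations=100, random_number_seed=None):
    """Plain Lloyd's algorithm with random initialization and Euclidean distance."""
    rng = random.Random(random_number_seed)
    # Random initialization: pick distinct input points as starting centroids.
    centroids = rng.sample(points, number_of_centroids)
    labels = None

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(number_of_centroids), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(number_of_centroids):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = k_means(points, number_of_centroids=2, random_number_seed=42)
labels_again, _ = k_means(points, number_of_centroids=2, random_number_seed=42)
```

Fixing the seed makes the starting centroids, and therefore the resulting assignment, repeatable across runs.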
- example.assets.example_components.azureml_latent_dirichlet_allocation(dataset: Optional[pathlib.Path] = None, target_columns: Optional[str] = None, number_of_topics_to_model: int = 5, n_grams: int = 2, normalize: bool = True, show_all_options: example.assets.example_components._assets._AzuremlLatentDirichletAllocationShowAllOptionsEnum = _AzuremlLatentDirichletAllocationShowAllOptionsEnum.false, rho_parameter: float = 0.01, alpha_parameter: float = 0.01, estimated_number_of_documents: int = 1000, size_of_the_batch: int = 32, initial_value_of_iteration_count: int = 10, power_applied_to_the_iteration_during_updates: float = 0.5, passes: int = 25, build_dictionary_of_ngrams_prior_to_lda: example.assets.example_components._assets._AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsPriorToLdaEnum = _AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsPriorToLdaEnum.true, maximum_number_of_ngrams_in_dictionary: int = 20000, hash_bits: int = 12, build_dictionary_of_ngrams: example.assets.example_components._assets._AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsEnum = _AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsEnum.true, maximum_size_of_ngram_dictionary: int = 20000, number_of_hash_bits: int = 12) example.assets.example_components._assets._AzuremlLatentDirichletAllocationComponent
Topic Modeling: Latent Dirichlet Allocation.
- Parameters
dataset (Path) – Input dataset
target_columns (str) – Target column name or index
number_of_topics_to_model (int) – Model the document distribution against N topics (min: 1, max: 1000)
n_grams (int) – Order of N-grams generated during hashing (min: 1, max: 10)
normalize (bool) – Normalize output to probabilities. The feature topic matrix will be P(word|topic).
show_all_options (_AzuremlLatentDirichletAllocationShowAllOptionsEnum) – Presents additional parameters specific to scikit-learn online LDA (enum: [‘True’, ‘False’])
rho_parameter (float) – Rho parameter (optional, min: 2.220446049250313e-16, max: 1.0)
alpha_parameter (float) – Alpha parameter (optional, min: 2.220446049250313e-16, max: 1.0)
estimated_number_of_documents (int) – Estimated number of documents (optional, min: 1, max: 2147483647)
size_of_the_batch (int) – Size of the batch (optional, min: 1, max: 1024)
initial_value_of_iteration_count (int) – Initial value of iteration count used in learning rate update schedule (optional, min: 1, max: 2147483647)
power_applied_to_the_iteration_during_updates (float) – Power applied to the iteration count during online updates (optional, min: 0.5, max: 1.0)
passes (int) – Number of training iterations (optional, min: 1, max: 1024)
build_dictionary_of_ngrams_prior_to_lda (_AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsPriorToLdaEnum) – Builds a dictionary of ngrams prior to LDA. Useful for model inspection and interpretation (optional, enum: [‘True’, ‘False’])
maximum_number_of_ngrams_in_dictionary (int) – Maximum size of the dictionary. If the number of tokens in the input exceeds this size, collisions may occur (optional, min: 1, max: 2147483647)
hash_bits (int) – Number of bits to use for feature hashing (optional, min: 1, max: 31)
build_dictionary_of_ngrams (_AzuremlLatentDirichletAllocationBuildDictionaryOfNgramsEnum) – Builds a dictionary of ngrams prior to computing LDA. Useful for model inspection and interpretation (optional, enum: [‘True’, ‘False’])
maximum_size_of_ngram_dictionary (int) – Maximum size of the ngrams dictionary. If the number of tokens in the input exceeds this size, collisions may occur (optional, min: 1, max: 2147483647)
number_of_hash_bits (int) – Number of bits to use during feature hashing (optional, min: 1, max: 31)
- Output transformed_dataset
Output dataset
- Type
transformed_dataset: Output
- Output feature_topic_matrix
Feature topic matrix produced by LDA
- Type
feature_topic_matrix: Output
- Output lda_transformation
Transformation that applies LDA to the dataset
- Type
lda_transformation: Output
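The n_grams and hash_bits parameters describe feature hashing of n-grams. The sketch below shows the general idea only, under stated assumptions: n-grams of every order from 1 to n_grams are generated, the table has 2**hash_bits buckets, and zlib.crc32 stands in for whatever hash the component actually uses. The helper hashed_ngram_counts is hypothetical.

```python
import zlib


def hashed_ngram_counts(tokens, n_grams=2, hash_bits=12):
    """Count n-grams of order 1..n_grams in a table of 2**hash_bits buckets."""
    buckets = [0] * (2 ** hash_bits)
    for order in range(1, n_grams + 1):
        for start in range(len(tokens) - order + 1):
            ngram = " ".join(tokens[start:start + order])
            # crc32 is a stand-in hash; distinct n-grams can share a bucket.
            buckets[zlib.crc32(ngram.encode("utf-8")) % len(buckets)] += 1
    return buckets


counts = hashed_ngram_counts(["topic", "models", "find", "topics"],
                             n_grams=2, hash_bits=12)
```

Fewer hash bits give a smaller table, so more distinct n-grams land in the same bucket.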
- example.assets.example_components.azureml_linear_regression(solution_method: example.assets.example_components._assets._AzuremlLinearRegressionSolutionMethodEnum = _AzuremlLinearRegressionSolutionMethodEnum.ordinary_least_squares, create_trainer_mode: example.assets.example_components._assets._AzuremlLinearRegressionCreateTrainerModeEnum = _AzuremlLinearRegressionCreateTrainerModeEnum.singleparameter, learning_rate: float = 0.1, number_of_epochs_over_which_algorithm_iterates_through_examples: int = 10, l2_regularization_term_weight: float = 0.001, range_for_learning_rate: str = '0.025; 0.05; 0.1; 0.2', range_for_number_of_epochs_over_which_algorithm_iterates_through_examples: str = '1; 10; 100', range_for_l2_regularization_term_weight: str = '0.001; 0.01; 0.1', should_input_instances_be_normalized: bool = True, decrease_learning_rate_as_iterations_progress: bool = True, l2_regularization_weight: float = 0.001, include_intercept_term: bool = True, random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlLinearRegressionComponent
Creates a linear regression model.
- Parameters
solution_method (_AzuremlLinearRegressionSolutionMethodEnum) – Choose an optimization method (enum: [‘Online Gradient Descent’, ‘Ordinary Least Squares’])
create_trainer_mode (_AzuremlLinearRegressionCreateTrainerModeEnum) – Create advanced learner options (optional, enum: [‘SingleParameter’, ‘ParameterRange’])
learning_rate (float) – Specify the initial learning rate for the stochastic gradient descent optimizer (optional, min: 2.220446049250313e-16)
number_of_epochs_over_which_algorithm_iterates_through_examples (int) – Specify how many times the algorithm should iterate through examples. For datasets with a small number of examples, this number should be large to reach convergence. (optional)
l2_regularization_term_weight (float) – Specify the weight for L2 regularization. Use a non-zero value to avoid overfitting. (optional)
range_for_learning_rate (str) – Specify the range for the initial learning rate for the stochastic gradient descent optimizer (optional)
range_for_number_of_epochs_over_which_algorithm_iterates_through_examples (str) – Specify range for how many times the algorithm should iterate through examples. For datasets with a small number of examples, this number should be large to reach convergence. (optional)
range_for_l2_regularization_term_weight (str) – Specify the range for the weight for L2 regularization. Use a non-zero value to avoid overfitting. (optional)
should_input_instances_be_normalized (bool) – Indicate whether instances should be normalized (optional)
decrease_learning_rate_as_iterations_progress (bool) – Indicate whether the learning rate should decrease as iterations progress (optional)
l2_regularization_weight (float) – Specify the weight for the L2 regularization. Use a non-zero value to avoid overfitting. (optional)
include_intercept_term (bool) – Indicate whether an additional term should be added for the intercept (optional)
random_number_seed (int) – Specify a value to seed the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained regression model
- Type
untrained_model: Output
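The range_for_* parameters take semicolon-separated strings such as '0.025; 0.05; 0.1; 0.2'. As a hedged illustration of what such strings encode, the sketch below parses the default values from the signature above and enumerates every combination. The helper parse_range and the dictionary keys are hypothetical. Whether the values are swept exhaustively or sampled is not stated here and may be decided by a downstream tuning step.

```python
from itertools import product


def parse_range(text, cast=float):
    """Split a semicolon-separated range string into a list of values."""
    return [cast(part.strip()) for part in text.split(";") if part.strip()]


learning_rates = parse_range("0.025; 0.05; 0.1; 0.2")
epochs = parse_range("1; 10; 100", cast=int)
l2_weights = parse_range("0.001; 0.01; 0.1")

# Every combination of the three ranges.
grid = [
    {"learning_rate": lr, "epochs": ep, "l2_weight": l2}
    for lr, ep, l2 in product(learning_rates, epochs, l2_weights)
]
```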
- example.assets.example_components.azureml_multiclass_boosted_decision_tree(create_trainer_mode: example.assets.example_components._assets._AzuremlMulticlassBoostedDecisionTreeCreateTrainerModeEnum = _AzuremlMulticlassBoostedDecisionTreeCreateTrainerModeEnum.singleparameter, maximum_number_of_leaves_per_tree: int = 20, minimum_number_of_training_instances_required_to_form_a_leaf: int = 10, the_learning_rate: float = 0.2, total_number_of_trees_constructed: int = 100, range_for_maximum_number_of_leaves_per_tree: str = '2; 8; 32; 128', range_for_minimum_number_of_training_instances_required_to_form_a_leaf: str = '1; 10; 50', range_for_learning_rate: str = '0.025; 0.05; 0.1; 0.2; 0.4', range_for_total_number_of_trees_constructed: str = '20; 100; 500', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlMulticlassBoostedDecisionTreeComponent
Creates a multiclass classifier using a boosted decision tree algorithm.
- Parameters
create_trainer_mode (_AzuremlMulticlassBoostedDecisionTreeCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
maximum_number_of_leaves_per_tree (int) – Specify the maximum number of leaves allowed per tree (optional, min: 2, max: 131072)
minimum_number_of_training_instances_required_to_form_a_leaf (int) – Specify the minimum number of cases required to form a leaf (optional, min: 1)
the_learning_rate (float) – Specify the initial learning rate (optional, min: 2.220446049250313e-16, max: 1.0)
total_number_of_trees_constructed (int) – Specify the maximum number of trees that can be created during training (optional, min: 1)
range_for_maximum_number_of_leaves_per_tree (str) – Specify range for the maximum number of leaves allowed per tree (optional)
range_for_minimum_number_of_training_instances_required_to_form_a_leaf (str) – Specify the range for the minimum number of cases required to form a leaf (optional)
range_for_learning_rate (str) – Specify the range for the initial learning rate (optional)
range_for_total_number_of_trees_constructed (str) – Specify the range for the maximum number of trees that can be created during training (optional)
random_number_seed (int) – Type a value to seed the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained multiclass classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_multiclass_decision_forest(create_trainer_mode: example.assets.example_components._assets._AzuremlMulticlassDecisionForestCreateTrainerModeEnum = _AzuremlMulticlassDecisionForestCreateTrainerModeEnum.singleparameter, number_of_decision_trees: int = 8, maximum_depth_of_the_decision_trees: int = 32, minimum_number_of_samples_per_leaf_node: int = 1, range_for_number_of_decision_trees: str = '1; 8; 32', range_for_the_maximum_depth_of_the_decision_trees: str = '1; 16; 64', range_for_the_minimum_number_of_samples_per_leaf_node: str = '1; 4; 16', resampling_method: example.assets.example_components._assets._AzuremlMulticlassDecisionForestResamplingMethodEnum = _AzuremlMulticlassDecisionForestResamplingMethodEnum.bagging_resampling) example.assets.example_components._assets._AzuremlMulticlassDecisionForestComponent
Creates a multiclass classification model using the decision forest algorithm.
- Parameters
create_trainer_mode (_AzuremlMulticlassDecisionForestCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
number_of_decision_trees (int) – Specify the number of decision trees to create in the ensemble (optional, min: 1)
maximum_depth_of_the_decision_trees (int) – Specify the maximum depth of any decision tree that can be created in the ensemble (optional, min: 1)
minimum_number_of_samples_per_leaf_node (int) – Specify the minimum number of training samples required to generate a leaf node (optional, min: 1)
range_for_number_of_decision_trees (str) – Specify range for the number of decision trees to create in the ensemble (optional)
range_for_the_maximum_depth_of_the_decision_trees (str) – Specify range for the maximum depth of the decision trees (optional)
range_for_the_minimum_number_of_samples_per_leaf_node (str) – Specify range for the minimum number of samples per leaf node (optional)
resampling_method (_AzuremlMulticlassDecisionForestResamplingMethodEnum) – Choose a resampling method (enum: [‘Bagging Resampling’, ‘Replicate Resampling’])
- Output untrained_model
An untrained multiclass classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_multiclass_logistic_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlMulticlassLogisticRegressionCreateTrainerModeEnum = _AzuremlMulticlassLogisticRegressionCreateTrainerModeEnum.singleparameter, optimization_tolerance: float = 1e-07, l2_regularizaton_weight: float = 1.0, range_for_optimization_tolerance: str = '0.00001; 0.00000001', range_for_l2_regularization_weight: str = '0.01; 0.1; 1.0', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlMulticlassLogisticRegressionComponent
Creates a multiclass logistic regression classification model.
- Parameters
create_trainer_mode (_AzuremlMulticlassLogisticRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
optimization_tolerance (float) – Specify a tolerance value for the L-BFGS optimizer (optional, min: 2.220446049250313e-16)
l2_regularizaton_weight (float) – Specify the L2 regularization weight. Use a non-zero value to avoid overfitting. (optional)
range_for_optimization_tolerance (str) – Specify a range for the tolerance value for the L-BFGS optimizer (optional)
range_for_l2_regularization_weight (str) – Specify the range for the L2 regularization weight. Use a non-zero value to avoid overfitting. (optional)
random_number_seed (int) – Type a value to seed the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_multiclass_neural_network(create_trainer_mode: example.assets.example_components._assets._AzuremlMulticlassNeuralNetworkCreateTrainerModeEnum = _AzuremlMulticlassNeuralNetworkCreateTrainerModeEnum.singleparameter, hidden_layer_specification: example.assets.example_components._assets._AzuremlMulticlassNeuralNetworkHiddenLayerSpecificationEnum = _AzuremlMulticlassNeuralNetworkHiddenLayerSpecificationEnum.fully_connected_case, number_of_hidden_nodes: str = '100', the_learning_rate: float = 0.1, number_of_learning_iterations: int = 100, hidden_layer_specification1: example.assets.example_components._assets._AzuremlMulticlassNeuralNetworkHiddenLayerSpecification1Enum = _AzuremlMulticlassNeuralNetworkHiddenLayerSpecification1Enum.fully_connected_case, number_of_hidden_nodes1: str = '100', range_for_learning_rate: str = '0.1; 0.2; 0.4', range_for_number_of_learning_iterations: str = '20; 40; 80; 160', the_momentum: float = 0, shuffle_examples: bool = True, random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlMulticlassNeuralNetworkComponent
Creates a multiclass classification model using a neural network algorithm.
- Parameters
create_trainer_mode (_AzuremlMulticlassNeuralNetworkCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
hidden_layer_specification (_AzuremlMulticlassNeuralNetworkHiddenLayerSpecificationEnum) – Specify the architecture of the hidden layer or layers (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes (str) – Type the number of nodes in the hidden layer. For multiple hidden layers, type a comma-separated list. (optional)
the_learning_rate (float) – Specify the size of each step in the learning process (optional, min: 2.220446049250313e-16, max: 2.0)
number_of_learning_iterations (int) – Specify the number of iterations while learning (optional, min: 1)
hidden_layer_specification1 (_AzuremlMulticlassNeuralNetworkHiddenLayerSpecification1Enum) – Specify the architecture of the hidden layer or layers for range (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes1 (str) – Type the number of nodes in the hidden layer, or for multiple hidden layers, type a comma-separated list. (optional)
range_for_learning_rate (str) – Specify the range for the size of each step in the learning process (optional)
range_for_number_of_learning_iterations (str) – Specify the range for the number of iterations while learning (optional)
the_momentum (float) – Specify a weight to apply during learning to nodes from previous iterations (max: 1.0)
shuffle_examples (bool) – Select this option to change the order of instances between learning iterations
random_number_seed (int) – Specify a numeric seed to use for random number generation. Leave blank to use the default seed. (optional, max: 4294967295)
- Output untrained_model
An untrained multiclass classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_neural_network_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlNeuralNetworkRegressionCreateTrainerModeEnum = _AzuremlNeuralNetworkRegressionCreateTrainerModeEnum.singleparameter, hidden_layer_specification: example.assets.example_components._assets._AzuremlNeuralNetworkRegressionHiddenLayerSpecificationEnum = _AzuremlNeuralNetworkRegressionHiddenLayerSpecificationEnum.fully_connected_case, number_of_hidden_nodes: str = '100', the_learning_rate: float = 0.1, number_of_learning_iterations: int = 100, hidden_layer_specification1: example.assets.example_components._assets._AzuremlNeuralNetworkRegressionHiddenLayerSpecification1Enum = _AzuremlNeuralNetworkRegressionHiddenLayerSpecification1Enum.fully_connected_case, number_of_hidden_nodes1: str = '100', range_for_learning_rate: str = '0.1; 0.2; 0.4', range_for_number_of_learning_iterations: str = '20; 40; 80; 160', the_momentum: float = 0, shuffle_examples: bool = True, random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlNeuralNetworkRegressionComponent
Creates a regression model using a neural network algorithm.
- Parameters
create_trainer_mode (_AzuremlNeuralNetworkRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
hidden_layer_specification (_AzuremlNeuralNetworkRegressionHiddenLayerSpecificationEnum) – Specify the architecture of the hidden layer or layers (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes (str) – Type the number of nodes in the hidden layer. For multiple hidden layers, type a comma-separated list. (optional)
the_learning_rate (float) – Specify the size of each step in the learning process (optional, min: 2.220446049250313e-16, max: 2.0)
number_of_learning_iterations (int) – Specify the number of iterations while learning (optional, min: 1)
hidden_layer_specification1 (_AzuremlNeuralNetworkRegressionHiddenLayerSpecification1Enum) – Specify the architecture of the hidden layer or layers for range (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes1 (str) – Type the number of nodes in the hidden layer, or for multiple hidden layers, type a comma-separated list. (optional)
range_for_learning_rate (str) – Specify the range for the size of each step in the learning process (optional)
range_for_number_of_learning_iterations (str) – Specify the range for the number of iterations while learning (optional)
the_momentum (float) – Specify a weight to apply during learning to nodes from previous iterations (max: 1.0)
shuffle_examples (bool) – Select this option to change the order of instances between learning iterations
random_number_seed (int) – Specify a numeric seed to use for random number generation. Leave blank to use the default seed. (optional, max: 4294967295)
- Output untrained_model
An untrained regression model
- Type
untrained_model: Output
- example.assets.example_components.azureml_normalize_data(dataset: Optional[pathlib.Path] = None, transformation_method: example.assets.example_components._assets._AzuremlNormalizeDataTransformationMethodEnum = _AzuremlNormalizeDataTransformationMethodEnum.zscore, use_0_for_constant_columns_when_checked: bool = True, columns_to_transform: Optional[str] = None) example.assets.example_components._assets._AzuremlNormalizeDataComponent
Rescales numeric data to constrain dataset values to a standard range.
- Parameters
dataset (Path) – Input dataset
transformation_method (_AzuremlNormalizeDataTransformationMethodEnum) – Choose the mathematical method used for scaling (enum: [‘ZScore’, ‘MinMax’, ‘Logistic’, ‘LogNormal’, ‘Tanh’])
use_0_for_constant_columns_when_checked (bool) – Use NaN for constant columns when unchecked or 0 when checked (optional)
columns_to_transform (str) – Select all columns to which the selected transformation should be applied
- Output transformed_dataset
Transformed dataset
- Type
transformed_dataset: Output
- Output transformation_function
Definition of the transformation function, which can be applied to other datasets
- Type
transformation_function: Output
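For reference, two of the listed methods (ZScore and MinMax) and the constant-column flag can be sketched for a single numeric column as follows. This is an illustration under assumptions, not the component's code: the function normalize is hypothetical, and the population standard deviation is assumed for ZScore.

```python
import math
import statistics


def normalize(values, method="ZScore", use_0_for_constant_columns=True):
    """Rescale one numeric column with ZScore or MinMax (illustrative sketch)."""
    if method == "ZScore":
        # Assumption: population standard deviation.
        center, scale = statistics.fmean(values), statistics.pstdev(values)
    elif method == "MinMax":
        center, scale = min(values), max(values) - min(values)
    else:
        raise ValueError(f"method not covered by this sketch: {method}")
    if scale == 0:
        # Constant column: 0 when the flag is set, NaN otherwise.
        fill = 0.0 if use_0_for_constant_columns else math.nan
        return [fill] * len(values)
    return [(v - center) / scale for v in values]


z_scores = normalize([1.0, 2.0, 3.0])
min_max = normalize([1.0, 2.0, 3.0], method="MinMax")
constant = normalize([5.0, 5.0, 5.0])
constant_nan = normalize([5.0, 5.0, 5.0], use_0_for_constant_columns=False)
```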
- example.assets.example_components.azureml_one_vs_all_multiclass(untrained_binary_classification_model: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlOneVsAllMulticlassComponent
Creates a one-vs-all multiclass classification model from an ensemble of binary classification models.
- Parameters
untrained_binary_classification_model (Path) – An untrained binary classification model
- Output untrained_model
An untrained multi-class classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_one_vs_one_multiclass(untrained_binary_classification_model: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlOneVsOneMulticlassComponent
Creates a one-vs-one multiclass classification model from an ensemble of binary classification models.
- Parameters
untrained_binary_classification_model (Path) – An untrained binary classification model
- Output untrained_model
An untrained multi-class classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_partition_and_sample(dataset: Optional[pathlib.Path] = None, partition_or_sample_mode: example.assets.example_components._assets._AzuremlPartitionAndSamplePartitionOrSampleModeEnum = _AzuremlPartitionAndSamplePartitionOrSampleModeEnum.sampling, use_replacement_in_the_partitioning: bool = False, randomized_split: bool = True, random_seed: int = 0, specify_the_partitioner_method: example.assets.example_components._assets._AzuremlPartitionAndSampleSpecifyThePartitionerMethodEnum = _AzuremlPartitionAndSampleSpecifyThePartitionerMethodEnum.partition_evenly, specify_how_many_folds_do_you_want_to_split_evenly_into: int = 5, stratified_split: example.assets.example_components._assets._AzuremlPartitionAndSampleStratifiedSplitEnum = _AzuremlPartitionAndSampleStratifiedSplitEnum.false, stratification_key_column: Optional[str] = None, proportion_list_of_customized_folds_separated_by_comma: Optional[str] = None, stratified_split_for_customized_fold_assignment: example.assets.example_components._assets._AzuremlPartitionAndSampleStratifiedSplitForCustomizedFoldAssignmentEnum = _AzuremlPartitionAndSampleStratifiedSplitForCustomizedFoldAssignmentEnum.false, stratification_key_column_for_customized_fold_assignment: Optional[str] = None, specify_which_fold_to_be_sampled_from: int = 1, pick_complement_of_the_selected_fold: bool = False, rate_of_sampling: float = 0.01, random_seed_for_sampling: int = 0, stratified_split_for_sampling: example.assets.example_components._assets._AzuremlPartitionAndSampleStratifiedSplitForSamplingEnum = _AzuremlPartitionAndSampleStratifiedSplitForSamplingEnum.false, stratification_key_column_for_sampling: Optional[str] = None, number_of_rows_to_select: int = 10) example.assets.example_components._assets._AzuremlPartitionAndSampleComponent
Creates multiple partitions of a dataset based on sampling.
- Parameters
dataset (Path) – Dataset to be split
partition_or_sample_mode (_AzuremlPartitionAndSamplePartitionOrSampleModeEnum) – Select the partition or sampling mode (enum: [‘Assign to Folds’, ‘Pick Fold’, ‘Sampling’, ‘Head’])
use_replacement_in_the_partitioning (bool) – Indicate whether rows are drawn with replacement when partitioning, or without replacement (optional)
randomized_split (bool) – Indicates whether the split is random (optional)
random_seed (int) – Specify a seed for the random number generator (optional, max: 4294967295)
specify_the_partitioner_method (_AzuremlPartitionAndSampleSpecifyThePartitionerMethodEnum) – EvenSize where you specify number of folds, or ShapeInPct where you specify a list of percentage numbers (optional, enum: [‘Partition evenly’, ‘Partition with customized proportions’])
specify_how_many_folds_do_you_want_to_split_evenly_into (int) – Number of equal-sized partitions to split the dataset into (optional, min: 1)
stratified_split (_AzuremlPartitionAndSampleStratifiedSplitEnum) – Indicates whether the split is stratified or not (optional, enum: [‘True’, ‘False’])
stratification_key_column (str) – Column containing stratification key (optional)
proportion_list_of_customized_folds_separated_by_comma (str) – List of proportions separated by comma (optional)
stratified_split_for_customized_fold_assignment (_AzuremlPartitionAndSampleStratifiedSplitForCustomizedFoldAssignmentEnum) – Indicates whether the split is stratified or not for customized fold assignments (optional, enum: [‘True’, ‘False’])
stratification_key_column_for_customized_fold_assignment (str) – Column containing stratification key for customized fold assignments (optional)
specify_which_fold_to_be_sampled_from (int) – Index of the partitioned fold to be sampled from (optional, min: 1)
pick_complement_of_the_selected_fold (bool) – Select the complement of the chosen fold, that is, all rows outside it (optional)
rate_of_sampling (float) – Sampling rate (optional)
random_seed_for_sampling (int) – Random number generator seed for sampling (optional, max: 4294967295)
stratified_split_for_sampling (_AzuremlPartitionAndSampleStratifiedSplitForSamplingEnum) – Indicates whether the split is stratified or not for sampling (optional, enum: [‘True’, ‘False’])
stratification_key_column_for_sampling (str) – Column containing stratification key for sampling (optional)
number_of_rows_to_select (int) – Maximum number of records that will be allowed to pass through to the next module (optional)
- Output odataset
Dataset resulting from the split
- Type
odataset: Output
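The 'Assign to Folds' and 'Pick Fold' modes can be pictured with the following standard-library sketch. It assumes even partitioning without stratification and a 1-based fold index, matching the minimum of 1 documented above. The helpers assign_to_folds and pick_fold are hypothetical and do not reproduce the component's exact row ordering.

```python
import random


def assign_to_folds(rows, folds=5, randomized_split=True, random_seed=0):
    """Split rows into `folds` near-equal partitions (sketch of 'Partition evenly')."""
    order = list(range(len(rows)))
    if randomized_split:
        random.Random(random_seed).shuffle(order)
    # Deal positions round-robin so fold sizes differ by at most one.
    return [[rows[i] for i in order[f::folds]] for f in range(folds)]


def pick_fold(partitions, fold=1, pick_complement=False):
    """Return the 1-based fold, or every other fold when pick_complement is set."""
    if pick_complement:
        return [row for idx, part in enumerate(partitions, start=1)
                if idx != fold for row in part]
    return list(partitions[fold - 1])


rows = list(range(23))
partitions = assign_to_folds(rows, folds=5, random_seed=0)
validation = pick_fold(partitions, fold=1)
training = pick_fold(partitions, fold=1, pick_complement=True)
```

Picking a fold and its complement together recovers the whole dataset, which is the usual way to build a validation and training pair for cross-validation.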
- example.assets.example_components.azureml_pca_based_anomaly_detection(training_mode: example.assets.example_components._assets._AzuremlPcaBasedAnomalyDetectionTrainingModeEnum = _AzuremlPcaBasedAnomalyDetectionTrainingModeEnum.singleparameter, number_of_components_to_use_in_pca: int = 2, oversampling_parameter_for_randomized_pca: int = 2, enable_input_feature_mean_normalization: bool = False) example.assets.example_components._assets._AzuremlPcaBasedAnomalyDetectionComponent
Create a PCA-based anomaly detection model.
- Parameters
training_mode (_AzuremlPcaBasedAnomalyDetectionTrainingModeEnum) – Specify learner options. Use ‘SingleParameter’ to manually specify all values. Use ‘ParameterRange’ to sweep over tunable parameters. (enum: [‘SingleParameter’])
number_of_components_to_use_in_pca (int) – Specify the number of components to use in PCA. (optional, min: 1)
oversampling_parameter_for_randomized_pca (int) – Specify the accuracy parameter for randomized PCA training. (optional)
enable_input_feature_mean_normalization (bool) – Specify if the input data is normalized to have zero mean.
- Output untrained_model
An untrained PCA-based anomaly detection model.
- Type
untrained_model: Output
- example.assets.example_components.azureml_permutation_feature_importance(trained_model: Optional[pathlib.Path] = None, test_data: Optional[pathlib.Path] = None, random_seed: int = 0, metric_for_measuring_performance: example.assets.example_components._assets._AzuremlPermutationFeatureImportanceMetricForMeasuringPerformanceEnum = _AzuremlPermutationFeatureImportanceMetricForMeasuringPerformanceEnum.accuracy) example.assets.example_components._assets._AzuremlPermutationFeatureImportanceComponent
Computes the permutation feature importance scores of feature variables given a trained model and a test dataset.
- Parameters
trained_model (Path) – Trained model to be used for scoring
test_data (Path) – Test dataset for scoring and evaluating a model after permutation of feature values
random_seed (int) – Random number generator seed value (max: 4294967295)
metric_for_measuring_performance (_AzuremlPermutationFeatureImportanceMetricForMeasuringPerformanceEnum) – Evaluation metric (enum: [‘Accuracy’, ‘Precision’, ‘Recall’, ‘Mean Absolute Error’, ‘Root Mean Squared Error’, ‘Relative Absolute Error’, ‘Relative Squared Error’, ‘Coefficient of Determination’])
- Output feature_importance
Feature importance results
- Type
feature_importance: Output
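The idea behind permutation feature importance can be sketched in plain Python. This is a minimal illustration, not the component's implementation: the helper name `permutation_importance`, the `predict` callable, and the choice of accuracy as the only metric are assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(
    predict: Callable[[Sequence[float]], int],
    rows: List[List[float]],
    labels: List[int],
    random_seed: int = 0,
) -> List[float]:
    """Return the drop in accuracy caused by shuffling each feature column."""
    rng = random.Random(random_seed)

    def accuracy(data: List[List[float]]) -> float:
        hits = sum(1 for row, label in zip(data, labels) if predict(row) == label)
        return hits / len(labels)

    baseline = accuracy(rows)
    scores = []
    for col in range(len(rows[0])):
        # Shuffle one column while leaving every other column untouched.
        shuffled = [row[col] for row in rows]
        rng.shuffle(shuffled)
        permuted = [
            row[:col] + [value] + row[col + 1:]
            for row, value in zip(rows, shuffled)
        ]
        scores.append(baseline - accuracy(permuted))
    return scores


# Hypothetical model that only looks at the first feature.
rows = [[0.0, 5.0], [1.0, 3.0], [0.0, 7.0], [1.0, 1.0]]
labels = [0, 1, 0, 1]
scores = permutation_importance(lambda row: int(row[0] > 0.5), rows, labels)
```

A feature the model ignores, like the second column here, scores zero because shuffling it cannot change any prediction.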
- example.assets.example_components.azureml_poisson_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlPoissonRegressionCreateTrainerModeEnum = _AzuremlPoissonRegressionCreateTrainerModeEnum.singleparameter, tolerance_parameter_for_optimization_convergence_the_lower_the_value_the_slower_and_more_accurate_the_fitting: float = 1e-07, l1_regularization_weight: float = 1.0, l2_regularization_weight: float = 1.0, memory_size_for_l_bfgs_the_lower_the_value_the_faster_and_less_accurate_the_training: int = 20, range_for_optimization_tolerance: str = '0.00001; 0.00000001', range_for_l1_regularization_weight: str = '0.0; 0.01; 0.1; 1.0', range_for_l2_regularization_weight: str = '0.01; 0.1; 1.0', range_for_memory_size_for_l_bfgs_the_lower_the_value_the_faster_and_less_accurate_the_training: str = '5; 20; 50') example.assets.example_components._assets._AzuremlPoissonRegressionComponent
Creates a regression model that assumes data has a Poisson distribution
- Parameters
create_trainer_mode (_AzuremlPoissonRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
tolerance_parameter_for_optimization_convergence_the_lower_the_value_the_slower_and_more_accurate_the_fitting (float) – Specify a tolerance value for optimization convergence. The lower the value, the slower and more accurate the fitting. (optional, min: 2.220446049250313e-16)
l1_regularization_weight (float) – Specify the L1 regularization weight. Use a non-zero value to avoid overfitting the model. (optional)
l2_regularization_weight (float) – Specify the L2 regularization weight. Use a non-zero value to avoid overfitting the model. (optional)
memory_size_for_l_bfgs_the_lower_the_value_the_faster_and_less_accurate_the_training (int) – Indicate how much memory (in MB) to use for the L-BFGS optimizer. With less memory, training is faster but less accurate. (optional, min: 1)
range_for_optimization_tolerance (str) – Specify a range for the tolerance value for the L-BFGS optimizer (optional)
range_for_l1_regularization_weight (str) – Specify the range for the L1 regularization weight. Use a non-zero value to avoid overfitting. (optional)
range_for_l2_regularization_weight (str) – Specify the range for the L2 regularization weight. Use a non-zero value to avoid overfitting. (optional)
range_for_memory_size_for_l_bfgs_the_lower_the_value_the_faster_and_less_accurate_the_training (str) – Specify the range for the amount of memory (in MB) to use for the L-BFGS optimizer. The lower the value, the faster and less accurate the training. (optional)
- Output untrained_model
An untrained Poisson regression model
- Type
untrained_model: Output
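The range_for_* parameters are plain strings. Judging only from the default values in the signature above, they look like semicolon-separated lists; the sketch below parses them under that assumption. The helper `parse_range` is hypothetical, and whether the component sweeps the full Cartesian product of the ranges is also an assumption.

```python
import itertools
from typing import List


def parse_range(text: str) -> List[float]:
    """Split a semicolon-separated range string into floats (assumed format)."""
    return [float(part) for part in text.split(";") if part.strip()]


# Default strings copied from the signature above.
tolerances = parse_range("0.00001; 0.00000001")
l1_weights = parse_range("0.0; 0.01; 0.1; 1.0")
l2_weights = parse_range("0.01; 0.1; 1.0")
memory_sizes = [int(value) for value in parse_range("5; 20; 50")]

# Every combination of the four ranges (assumption: a full grid is swept).
grid = list(itertools.product(tolerances, l1_weights, l2_weights, memory_sizes))
```

With the default strings this yields 2 × 4 × 3 × 3 = 72 candidate settings.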
- example.assets.example_components.azureml_preprocess_text(dataset: Optional[pathlib.Path] = None, stop_words: Optional[pathlib.Path] = None, language: example.assets.example_components._assets._AzuremlPreprocessTextLanguageEnum = _AzuremlPreprocessTextLanguageEnum.english, expand_verb_contractions: bool = True, text_column_to_clean: Optional[str] = None, remove_stop_words: bool = True, use_lemmatization: bool = True, detect_sentences: bool = True, normalize_case_to_lowercase: bool = True, remove_numbers: bool = True, remove_special_characters: bool = True, remove_duplicate_characters: bool = True, remove_email_addresses: bool = True, remove_urls: bool = True, normalize_backslashes_to_slashes: bool = True, split_tokens_on_special_characters: bool = True, custom_regular_expression: Optional[str] = None, custom_replacement_string: Optional[str] = None) example.assets.example_components._assets._AzuremlPreprocessTextComponent
Performs cleaning operations on text.
- Parameters
dataset (Path) – Input data
stop_words (Path) – Optional custom list of stop words to remove (optional)
language (_AzuremlPreprocessTextLanguageEnum) – Select the language to preprocess (enum: [‘English’])
expand_verb_contractions (bool) – Expand verb contractions (English only) (optional)
text_column_to_clean (str) – Select the text column to clean
remove_stop_words (bool) – Remove stop words
use_lemmatization (bool) – Use lemmatization
detect_sentences (bool) – Detect sentences by adding a sentence terminator “|||” that can be used by the n-gram features extractor module
normalize_case_to_lowercase (bool) – Normalize case to lowercase
remove_numbers (bool) – Remove numbers
remove_special_characters (bool) – Remove non-alphanumeric special characters and replace them with the “|” character
remove_duplicate_characters (bool) – Remove duplicate characters
remove_email_addresses (bool) – Remove email addresses
remove_urls (bool) – Remove URLs
normalize_backslashes_to_slashes (bool) – Normalize backslashes to slashes
split_tokens_on_special_characters (bool) – Split tokens on special characters
custom_regular_expression (str) – Specify the custom regular expression (optional)
custom_replacement_string (str) – Specify the custom replacement string for the custom regular expression (optional)
- Output results_dataset
Results dataset
- Type
results_dataset: Output
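A few of the cleaning flags can be approximated with the standard `re` module. This is a rough sketch only: the regular expressions, the order of the steps, and the helper name `clean_text` are assumptions, and the component's actual rules may differ.

```python
import re
from typing import Optional


def clean_text(
    text: str,
    remove_urls: bool = True,
    remove_email_addresses: bool = True,
    remove_numbers: bool = True,
    normalize_case_to_lowercase: bool = True,
    custom_regular_expression: Optional[str] = None,
    custom_replacement_string: str = "",
) -> str:
    """Apply a subset of the cleaning steps with simplified patterns."""
    if remove_urls:
        text = re.sub(r"https?://\S+", " ", text)  # simplified URL pattern
    if remove_email_addresses:
        text = re.sub(r"\S+@\S+", " ", text)  # simplified email pattern
    if remove_numbers:
        text = re.sub(r"\d+", " ", text)
    if custom_regular_expression:
        text = re.sub(custom_regular_expression, custom_replacement_string, text)
    if normalize_case_to_lowercase:
        text = text.lower()
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


cleaned = clean_text("Mail Bob@example.com or see https://example.com/a1 in 2024")
custom = clean_text(
    "foo-bar", custom_regular_expression=r"-", custom_replacement_string="_"
)
```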
- example.assets.example_components.azureml_remove_duplicate_rows(dataset: Optional[pathlib.Path] = None, key_column_selection_filter_expression: Optional[str] = None, retain_first_duplicate_row: bool = True) example.assets.example_components._assets._AzuremlRemoveDuplicateRowsComponent
Removes the duplicate rows from a dataset.
- Parameters
dataset (Path) – Input dataset
key_column_selection_filter_expression (str) – Choose the key columns to use when searching for duplicates
retain_first_duplicate_row (bool) – Indicate whether to keep the first row of a set of duplicates and discard the others. If false, the last duplicate row encountered is kept.
- Output results_dataset
Filtered dataset
- Type
results_dataset: Output
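The effect of retain_first_duplicate_row can be illustrated on a list of dictionaries. This is a minimal sketch, assuming rows are dicts and key columns are given as a list of names; the helper `remove_duplicate_rows` is hypothetical and unrelated to the component's internals.

```python
from typing import Any, Dict, List


def remove_duplicate_rows(
    rows: List[Dict[str, Any]],
    key_columns: List[str],
    retain_first_duplicate_row: bool = True,
) -> List[Dict[str, Any]]:
    """Keep one row per key: the first seen, or the last seen."""
    kept: Dict[tuple, Dict[str, Any]] = {}
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if retain_first_duplicate_row:
            kept.setdefault(key, row)  # ignore later duplicates
        else:
            kept[key] = row  # later duplicates overwrite earlier ones
    # Output order follows the first appearance of each key.
    return list(kept.values())


rows = [
    {"id": 1, "value": "a"},
    {"id": 2, "value": "b"},
    {"id": 1, "value": "c"},
]
first_kept = remove_duplicate_rows(rows, ["id"])
last_kept = remove_duplicate_rows(rows, ["id"], retain_first_duplicate_row=False)
```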
- example.assets.example_components.azureml_resnet(model_name: example.assets.example_components._assets._AzuremlResnetModelNameEnum = _AzuremlResnetModelNameEnum.resnext101_32x8d, pretrained: bool = True, zero_init_residual: bool = False) example.assets.example_components._assets._AzuremlResnetComponent
Creates an image classification model using the ResNet algorithm.
- Parameters
model_name (_AzuremlResnetModelNameEnum) – Name of a certain resnet structure (enum: [‘resnet18’, ‘resnet34’, ‘resnet50’, ‘resnet101’, ‘resnet152’, ‘resnext50_32x4d’, ‘resnext101_32x8d’, ‘wide_resnet50_2’, ‘wide_resnet101_2’])
pretrained (bool) – Indicate whether to use a model pre-trained on ImageNet
zero_init_residual (bool) – Zero-initialize the last BN in each residual branch. (optional)
- Output untrained_model
Untrained resnet model path
- Type
untrained_model: Output
- example.assets.example_components.azureml_score_image_model(trained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlScoreImageModelComponent
Scores predictions for a trained image model.
- Parameters
trained_model (Path) – Trained predictive model
dataset (Path) – Input data to score
- Output scored_dataset
Dataset with obtained scores
- Type
scored_dataset: Output
- example.assets.example_components.azureml_score_model(trained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, append_score_columns_to_output: bool = True) example.assets.example_components._assets._AzuremlScoreModelComponent
Scores predictions for a trained classification or regression model.
- Parameters
trained_model (Path) – Trained predictive model
dataset (Path) – Input test dataset
append_score_columns_to_output (bool) – If checked, append score columns to the result dataset, otherwise only return the scores and true labels if available.
- Output scored_dataset
Dataset with obtained scores
- Type
scored_dataset: Output
- example.assets.example_components.azureml_score_svd_recommender(trained_svd_recommendation: Optional[pathlib.Path] = None, dataset_to_score: Optional[pathlib.Path] = None, training_data: Optional[pathlib.Path] = None, recommender_prediction_kind: example.assets.example_components._assets._AzuremlScoreSvdRecommenderRecommenderPredictionKindEnum = _AzuremlScoreSvdRecommenderRecommenderPredictionKindEnum.item_recommendation, recommended_item_selection: example.assets.example_components._assets._AzuremlScoreSvdRecommenderRecommendedItemSelectionEnum = _AzuremlScoreSvdRecommenderRecommendedItemSelectionEnum.from_rated_items_for_model_evaluation, minimum_size_of_the_recommendation_pool_for_a_single_user: int = 2, maximum_number_of_items_to_recommend_to_a_user: int = 5, whether_to_return_the_predicted_ratings_of_the_items_along_with_the_labels: bool = False) example.assets.example_components._assets._AzuremlScoreSvdRecommenderComponent
Score a dataset using the SVD recommendation.
- Parameters
trained_svd_recommendation (Path) – Trained SVD recommendation
dataset_to_score (Path) – Dataset to score
training_data (Path) – Dataset containing the training data. (Used to filter out already rated items from prediction) (optional)
recommender_prediction_kind (_AzuremlScoreSvdRecommenderRecommenderPredictionKindEnum) – Specify the type of prediction the recommendation should output (enum: [‘Rating Prediction’, ‘Item Recommendation’])
recommended_item_selection (_AzuremlScoreSvdRecommenderRecommendedItemSelectionEnum) – Select the set of items to make recommendations from (optional, enum: [‘From All Items’, ‘From Rated Items (for model evaluation)’, ‘From Unrated Items (to suggest new items to users)’])
minimum_size_of_the_recommendation_pool_for_a_single_user (int) – Specify the minimum size of the recommendation pool for each user (optional, min: 1)
maximum_number_of_items_to_recommend_to_a_user (int) – Specify the maximum number of items to recommend to a user (optional, min: 1)
whether_to_return_the_predicted_ratings_of_the_items_along_with_the_labels (bool) – Specify whether to return the predicted ratings of the items along with the labels (optional)
- Output scored_dataset
Scored dataset
- Type
scored_dataset: Output
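For item recommendation with ‘From Unrated Items (to suggest new items to users)’, the training data is used to filter out items the user has already rated. The sketch below shows that filtering step for a single user. The helper `recommend`, the dictionary of predicted ratings, and the tie-breaking rule are assumptions for illustration.

```python
from typing import Dict, List, Set


def recommend(
    predicted_ratings: Dict[str, float],
    already_rated: Set[str],
    maximum_number_of_items_to_recommend_to_a_user: int = 5,
) -> List[str]:
    """Return the top unrated items for one user, highest prediction first."""
    candidates = {
        item: score
        for item, score in predicted_ratings.items()
        if item not in already_rated
    }
    # Sort by descending score; break ties by item name for determinism.
    ranked = sorted(candidates, key=lambda item: (-candidates[item], item))
    return ranked[:maximum_number_of_items_to_recommend_to_a_user]


predicted = {"a": 4.5, "b": 3.0, "c": 4.9, "d": 1.0}
top_two = recommend(predicted, already_rated={"c"},
                    maximum_number_of_items_to_recommend_to_a_user=2)
```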
- example.assets.example_components.azureml_score_vowpal_wabbit_model(trained_vowpal_wabbit_model: Optional[pathlib.Path] = None, test_data: Optional[pathlib.Path] = None, vw_arguments: Optional[str] = None, name_of_the_test_data_file: Optional[str] = None, specify_file_type: example.assets.example_components._assets._AzuremlScoreVowpalWabbitModelSpecifyFileTypeEnum = _AzuremlScoreVowpalWabbitModelSpecifyFileTypeEnum.vw, include_an_extra_column_containing_labels: bool = False, include_an_extra_column_containing_raw_scores: bool = False) example.assets.example_components._assets._AzuremlScoreVowpalWabbitModelComponent
Score data using Vowpal Wabbit from the command line interface.
- Parameters
trained_vowpal_wabbit_model (Path) – Trained Vowpal Wabbit model.
test_data (Path) – Test data.
vw_arguments (str) – Type vowpal wabbit command line arguments. (optional)
name_of_the_test_data_file (str) – Type name of the test data file. (optional)
specify_file_type (_AzuremlScoreVowpalWabbitModelSpecifyFileTypeEnum) – Please specify file type. (enum: [‘VW’, ‘SVMLight’])
include_an_extra_column_containing_labels (bool) – Whether to include an extra column containing labels in the scored dataset.
include_an_extra_column_containing_raw_scores (bool) – Whether to include an extra column containing raw scores in the scored dataset.
- Output scored_dataset
Scored dataset
- Type
scored_dataset: Output
- example.assets.example_components.azureml_score_wide_and_deep_recommender(trained_wide_and_deep_recommendation_model: Optional[pathlib.Path] = None, dataset_to_score: Optional[pathlib.Path] = None, user_features: Optional[pathlib.Path] = None, item_features: Optional[pathlib.Path] = None, training_data: Optional[pathlib.Path] = None, recommender_prediction_kind: example.assets.example_components._assets._AzuremlScoreWideAndDeepRecommenderRecommenderPredictionKindEnum = _AzuremlScoreWideAndDeepRecommenderRecommenderPredictionKindEnum.item_recommendation, recommended_item_selection: example.assets.example_components._assets._AzuremlScoreWideAndDeepRecommenderRecommendedItemSelectionEnum = _AzuremlScoreWideAndDeepRecommenderRecommendedItemSelectionEnum.from_rated_items_for_model_evaluation, minimum_size_of_the_recommendation_pool_for_a_single_user: int = 2, maximum_number_of_items_to_recommend_to_a_user: int = 5, whether_to_return_the_predicted_ratings_of_the_items_along_with_the_labels: bool = False) example.assets.example_components._assets._AzuremlScoreWideAndDeepRecommenderComponent
Score a dataset using the Wide and Deep recommendation model.
- Parameters
trained_wide_and_deep_recommendation_model (Path) – Trained Wide and Deep recommendation model
dataset_to_score (Path) – Dataset to score
user_features (Path) – User features (optional)
item_features (Path) – Item features (optional)
training_data (Path) – Dataset containing the training data. (Used to filter out already rated items from prediction) (optional)
recommender_prediction_kind (_AzuremlScoreWideAndDeepRecommenderRecommenderPredictionKindEnum) – Specify the type of prediction the recommendation should output (enum: [‘Rating Prediction’, ‘Item Recommendation’])
recommended_item_selection (_AzuremlScoreWideAndDeepRecommenderRecommendedItemSelectionEnum) – Select the set of items to make recommendations from (optional, enum: [‘From All Items’, ‘From Rated Items (for model evaluation)’, ‘From Unrated Items (to suggest new items to users)’])
minimum_size_of_the_recommendation_pool_for_a_single_user (int) – Specify the minimum size of the recommendation pool for each user (optional, min: 1)
maximum_number_of_items_to_recommend_to_a_user (int) – Specify the maximum number of items to recommend to a user (optional, min: 1)
whether_to_return_the_predicted_ratings_of_the_items_along_with_the_labels (bool) – Specify whether to return the predicted ratings of the items along with the labels (optional)
- Output scored_dataset
Scored dataset
- Type
scored_dataset: Output
- example.assets.example_components.azureml_select_columns_in_dataset(dataset: Optional[pathlib.Path] = None, select_columns: Optional[str] = None) example.assets.example_components._assets._AzuremlSelectColumnsInDatasetComponent
Selects columns to include or exclude from a dataset in an operation.
- Parameters
dataset (Path) – Input dataset
select_columns (str) – Select columns to keep in the projected dataset
- Output results_dataset
Output dataset
- Type
results_dataset: Output
- example.assets.example_components.azureml_select_columns_transform(dataset_with_desired_columns: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlSelectColumnsTransformComponent
Create a transformation that selects the same subset of columns as in the given dataset.
- Parameters
dataset_with_desired_columns (Path) – Dataset containing desired set of columns
- Output columns_selection_transformation
Transformation that selects the same subset of columns as in the given dataset.
- Type
columns_selection_transformation: Output
- example.assets.example_components.azureml_smote(samples: Optional[pathlib.Path] = None, label_column: Optional[str] = None, smote_percentage: int = 100, number_of_nearest_neighbors: int = 1, random_seed: int = 0) example.assets.example_components._assets._AzuremlSmoteComponent
Increases the number of low incidence examples in a dataset.
- Parameters
samples (Path) – A DataTable of samples
label_column (str) – Select the column that contains the label or outcome
smote_percentage (int) – Amount of oversampling. If not an integral multiple of 100, the minority class will be randomized and downsampled from the next integral multiple of 100.
number_of_nearest_neighbors (int) – The number of nearest neighbors (min: 1)
random_seed (int) – Random number generator seed (max: 4294967295)
- Output table
A DataTable containing the original samples plus an additional (smote_percentage / 100) × T synthetic minority class samples, where T is the number of minority class samples
- Type
table: Output
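The standard SMOTE technique creates each synthetic sample by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below illustrates that step in plain Python. It is not the component's code: the helper `smote` is hypothetical, it only handles percentages that are multiples of 100, and it needs at least two minority samples.

```python
import math
import random
from typing import List


def smote(
    minority: List[List[float]],
    smote_percentage: int = 100,
    number_of_nearest_neighbors: int = 1,
    random_seed: int = 0,
) -> List[List[float]]:
    """Return synthetic minority samples built by neighbor interpolation."""
    rng = random.Random(random_seed)
    per_sample = smote_percentage // 100  # sketch: multiples of 100 only
    synthetic = []
    for index, sample in enumerate(minority):
        others = [p for j, p in enumerate(minority) if j != index]
        neighbors = sorted(others, key=lambda p: math.dist(sample, p))
        neighbors = neighbors[:number_of_nearest_neighbors]
        for _ in range(per_sample):
            neighbor = rng.choice(neighbors)
            gap = rng.random()  # position along the segment, in [0, 1)
            synthetic.append(
                [a + gap * (b - a) for a, b in zip(sample, neighbor)]
            )
    return synthetic


minority = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
synthetic = smote(minority, smote_percentage=200)
```

With three minority samples and a percentage of 200, the sketch produces six synthetic samples, each lying on a segment between two original points.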
- example.assets.example_components.azureml_split_data(dataset: Optional[pathlib.Path] = None, splitting_mode: example.assets.example_components._assets._AzuremlSplitDataSplittingModeEnum = _AzuremlSplitDataSplittingModeEnum.split_rows, fraction_of_rows_in_the_first_output_dataset: float = 0.5, randomized_split: bool = True, random_seed: int = 0, stratified_split: example.assets.example_components._assets._AzuremlSplitDataStratifiedSplitEnum = _AzuremlSplitDataStratifiedSplitEnum.false, stratification_key_column: Optional[str] = None, regular_expression: str = '"column name" ^start', relational_expression: str = '"column name" > 3') example.assets.example_components._assets._AzuremlSplitDataComponent
Partitions the rows of a dataset into two distinct sets.
- Parameters
dataset (Path) – Dataset to split
splitting_mode (_AzuremlSplitDataSplittingModeEnum) – Choose the method for splitting the dataset (enum: [‘Split Rows’, ‘Regular Expression’, ‘Relative Expression’])
fraction_of_rows_in_the_first_output_dataset (float) – Specify a ratio representing the number of rows in the first output dataset over the number of rows in the input dataset (optional, max: 1.0)
randomized_split (bool) – Indicate whether rows should be randomly selected (optional)
random_seed (int) – Provide a value to seed the random number generator (optional, max: 4294967295)
stratified_split (_AzuremlSplitDataStratifiedSplitEnum) – Indicate whether the rows in each split should be grouped using a strata column (optional, enum: [‘True’, ‘False’])
stratification_key_column (str) – Select the column containing the stratification key (optional)
regular_expression (str) – Type a regular expression to use as criteria when splitting the dataset on a string column (optional)
relational_expression (str) – Type a relational expression to use in splitting the dataset on a numeric column (optional)
- Output results_dataset1
Dataset containing selected rows
- Type
results_dataset1: Output
- Output results_dataset2
Dataset containing all other rows
- Type
results_dataset2: Output
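For the ‘Split Rows’ mode, the interaction of the fraction, the randomized flag, and the seed can be sketched as follows. This is an illustrative approximation that ignores stratification; the helper `split_rows` and its rounding rule are assumptions.

```python
import random
from typing import Any, List, Tuple


def split_rows(
    rows: List[Any],
    fraction_of_rows_in_the_first_output_dataset: float = 0.5,
    randomized_split: bool = True,
    random_seed: int = 0,
) -> Tuple[List[Any], List[Any]]:
    """Partition rows into two lists according to the requested fraction."""
    indices = list(range(len(rows)))
    if randomized_split:
        # A fixed seed makes the shuffle, and so the split, reproducible.
        random.Random(random_seed).shuffle(indices)
    cut = round(len(rows) * fraction_of_rows_in_the_first_output_dataset)
    first = [rows[i] for i in indices[:cut]]
    second = [rows[i] for i in indices[cut:]]
    return first, second


first, second = split_rows(list(range(10)), 0.7)
ordered_first, ordered_second = split_rows(
    list(range(4)), 0.5, randomized_split=False
)
```

Every input row ends up in exactly one of the two outputs, which mirrors the results_dataset1 and results_dataset2 pair.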
- example.assets.example_components.azureml_split_image_directory(input_image_directory: Optional[pathlib.Path] = None, fraction_of_images_in_the_first_output: float = 0.9) example.assets.example_components._assets._AzuremlSplitImageDirectoryComponent
Partitions the images of an image directory into two distinct sets.
- Parameters
input_image_directory (Path) – Input image directory
fraction_of_images_in_the_first_output (float) – Fraction of images in the first output (min: 2.220446049250313e-16, max: 0.9999999999999998)
- Output output_image_directory1
First output image directory
- Type
output_image_directory1: Output
- Output output_image_directory2
Second output image directory
- Type
output_image_directory2: Output
- example.assets.example_components.azureml_summarize_data(input: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlSummarizeDataComponent
Generates a basic descriptive statistics report for the columns in a dataset.
- Parameters
input (Path) – DataFrameDirectory
- Output result_dataset
DataFrameDirectory
- Type
result_dataset: Output
- example.assets.example_components.azureml_train_anomaly_detection_model(untrained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._AzuremlTrainAnomalyDetectionModelComponent
Trains an anomaly detector model and labels data from the training set
- Parameters
untrained_model (Path) – Untrained learner
dataset (Path) – Input data source
- Output trained_model
Trained anomaly detection model
- Type
trained_model: Output
- example.assets.example_components.azureml_train_clustering_model(untrained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, column_set: Optional[str] = None, check_for_append_or_uncheck_for_result_only: bool = True) example.assets.example_components._assets._AzuremlTrainClusteringModelComponent
Train clustering model and assign data to clusters.
- Parameters
untrained_model (Path) – Untrained clustering model
dataset (Path) – Input data source
column_set (str) – Column selection pattern
check_for_append_or_uncheck_for_result_only (bool) – Whether output dataset must contain input dataset appended by assignments column (Checked) or assignments column only (Unchecked)
- Output trained_model
Trained clustering model
- Type
trained_model: Output
- Output results_dataset
Input dataset appended by data column of assignments or assignments column only
- Type
results_dataset: Output
- example.assets.example_components.azureml_train_model(untrained_model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, label_column: Optional[str] = None, model_explanations: bool = False) example.assets.example_components._assets._AzuremlTrainModelComponent
Trains a classification or regression model in a supervised manner.
- Parameters
untrained_model (Path) – Untrained learner
dataset (Path) – Training data
label_column (str) – Select the column that contains the label or outcome
model_explanations (bool) – Whether to generate explanations for the trained model. Default is unchecked to reduce extra compute overhead. (optional)
- Output trained_model
Trained learner
- Type
trained_model: Output
- example.assets.example_components.azureml_train_pytorch_model(untrained_model: Optional[pathlib.Path] = None, training_dataset: Optional[pathlib.Path] = None, validation_dataset: Optional[pathlib.Path] = None, epochs: int = 5, batch_size: int = 16, warmup_step_number: int = 0, learning_rate: float = 0.001, random_seed: int = 1, patience: int = 3, print_frequency: int = 10) example.assets.example_components._assets._AzuremlTrainPytorchModelComponent
Train a PyTorch model from scratch or fine-tune it.
- Parameters
untrained_model (Path) – Untrained model
training_dataset (Path) – Input dataset for training
validation_dataset (Path) – Input dataset for validation
epochs (int) – Epochs. (min: 1)
batch_size (int) – Batch size. (min: 1)
warmup_step_number (int) – Warmup step number (optional)
learning_rate (float) – Learning rate. (min: 2.220446049250313e-16, max: 2.0)
random_seed (int) – Random seed.
patience (int) – Patience. (min: 1)
print_frequency (int) – Training log print frequency over iterations in each epoch. (optional, min: 1)
- Output trained_model
Trained model
- Type
trained_model: Output
- example.assets.example_components.azureml_train_svd_recommender(training_dataset_of_user_item_rating_triples: Optional[pathlib.Path] = None, number_of_factors: int = 200, number_of_recommendation_algorithm_iterations: int = 30, learning_rate: float = 0.005) example.assets.example_components._assets._AzuremlTrainSvdRecommenderComponent
Train a collaborative filtering recommendation using the SVD algorithm.
- Parameters
training_dataset_of_user_item_rating_triples (Path) – Ratings of items by users, expressed as triple (User, Item, Rating)
number_of_factors (int) – Specify the number of factors to use with recommendation (min: 1)
number_of_recommendation_algorithm_iterations (int) – Specify the maximum number of iterations to perform while training the recommendation model (min: 1)
learning_rate (float) – Specify the size of each step in the learning process (min: 2.220446049250313e-16, max: 2.0)
- Output trained_svd_recommendation
Trained SVD recommendation
- Type
trained_svd_recommendation: Output
- example.assets.example_components.azureml_train_vowpal_wabbit_model(pre_trained_vowpal_wabbit_model: Optional[pathlib.Path] = None, training_data: Optional[pathlib.Path] = None, vw_arguments: Optional[str] = None, name_of_the_training_data_file: Optional[str] = None, specify_file_type: example.assets.example_components._assets._AzuremlTrainVowpalWabbitModelSpecifyFileTypeEnum = _AzuremlTrainVowpalWabbitModelSpecifyFileTypeEnum.vw, output_readable_model_file: bool = False, output_inverted_hash_file: bool = False) example.assets.example_components._assets._AzuremlTrainVowpalWabbitModelComponent
Train a Vowpal Wabbit model using the command line interface.
- Parameters
pre_trained_vowpal_wabbit_model (Path) – Trained Vowpal Wabbit model. (optional)
training_data (Path) – Training data.
vw_arguments (str) – Type vowpal wabbit command line arguments. (optional)
name_of_the_training_data_file (str) – Type name of the training data file. (optional)
specify_file_type (_AzuremlTrainVowpalWabbitModelSpecifyFileTypeEnum) – Please specify file type. (enum: [‘VW’, ‘SVMLight’])
output_readable_model_file (bool) – Output readable model (--readable_model) file.
output_inverted_hash_file (bool) – Output inverted hash (--invert_hash) file.
- Output trained_vowpal_wabbit_model
Trained Vowpal Wabbit model
- Type
trained_vowpal_wabbit_model: Output
- example.assets.example_components.azureml_train_wide_and_deep_recommender(training_dataset_of_user_item_rating_triples: Optional[pathlib.Path] = None, user_features: Optional[pathlib.Path] = None, item_features: Optional[pathlib.Path] = None, epochs: int = 15, batch_size: int = 64, wide_part_optimizer: example.assets.example_components._assets._AzuremlTrainWideAndDeepRecommenderWidePartOptimizerEnum = _AzuremlTrainWideAndDeepRecommenderWidePartOptimizerEnum.adagrad, wide_optimizer_learning_rate: float = 0.1, crossed_feature_dimension: int = 1000, deep_part_optimizer: example.assets.example_components._assets._AzuremlTrainWideAndDeepRecommenderDeepPartOptimizerEnum = _AzuremlTrainWideAndDeepRecommenderDeepPartOptimizerEnum.adagrad, deep_optimizer_learning_rate: float = 0.1, user_embedding_dimension: int = 16, item_embedding_dimension: int = 16, categorical_features_embedding_dimension: int = 4, hidden_units: str = '256,128', activation_function: example.assets.example_components._assets._AzuremlTrainWideAndDeepRecommenderActivationFunctionEnum = _AzuremlTrainWideAndDeepRecommenderActivationFunctionEnum.relu, dropout: float = 0.8, batch_normalization: bool = True) example.assets.example_components._assets._AzuremlTrainWideAndDeepRecommenderComponent
Train a recommender based on the Wide & Deep model.
- Parameters
training_dataset_of_user_item_rating_triples (Path) – Ratings of items by users, expressed as triple (User, Item, Rating)
user_features (Path) – User features (optional)
item_features (Path) – Item features (optional)
epochs (int) – Maximum number of epochs to perform while training (min: 1)
batch_size (int) – Number of consecutive samples to combine in a single batch (min: 1)
wide_part_optimizer (_AzuremlTrainWideAndDeepRecommenderWidePartOptimizerEnum) – Optimizer used to apply gradients to the wide part of the model (enum: [‘Adagrad’, ‘Adam’, ‘Ftrl’, ‘RMSProp’, ‘SGD’, ‘Adadelta’])
wide_optimizer_learning_rate (float) – Size of each step in the learning process for wide part of the model (min: 2.220446049250313e-16, max: 2.0)
crossed_feature_dimension (int) – Crossed feature dimension for wide part model (min: 1)
deep_part_optimizer (_AzuremlTrainWideAndDeepRecommenderDeepPartOptimizerEnum) – Optimizer used to apply gradients to the deep part of the model (enum: [‘Adagrad’, ‘Adam’, ‘Ftrl’, ‘RMSProp’, ‘SGD’, ‘Adadelta’])
deep_optimizer_learning_rate (float) – Size of each step in the learning process for deep part of the model (min: 2.220446049250313e-16, max: 2.0)
user_embedding_dimension (int) – User embedding dimension for deep part model (min: 1)
item_embedding_dimension (int) – Item embedding dimension for deep part model (min: 1)
categorical_features_embedding_dimension (int) – Categorical features embedding dimension for deep part model (min: 1)
hidden_units (str) – Hidden units per layer for deep part model
activation_function (_AzuremlTrainWideAndDeepRecommenderActivationFunctionEnum) – Activation function applied to each layer in deep part model (enum: [‘ReLU’, ‘Sigmoid’, ‘Tanh’, ‘Linear’, ‘LeakyReLU’])
dropout (float) – Probability that each element is dropped in deep part model (max: 1.0)
batch_normalization (bool) – Whether to use batch normalization after each hidden layer
- Output trained_wide_and_deep_recommendation_model
Trained Wide and Deep recommendation model
- Type
trained_wide_and_deep_recommendation_model: Output
- example.assets.example_components.azureml_tune_model_hyperparameters(untrained_model: Optional[pathlib.Path] = None, training_dataset: Optional[pathlib.Path] = None, optional_validation_dataset: Optional[pathlib.Path] = None, specify_parameter_sweeping_mode: example.assets.example_components._assets._AzuremlTuneModelHyperparametersSpecifyParameterSweepingModeEnum = _AzuremlTuneModelHyperparametersSpecifyParameterSweepingModeEnum.random_sweep, maximum_number_of_runs_on_random_sweep: int = 5, random_seed: int = 0, name_or_numerical_index_of_the_label_column: Optional[str] = None, metric_for_measuring_performance_for_classification: example.assets.example_components._assets._AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForClassificationEnum = _AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForClassificationEnum.accuracy, metric_for_measuring_performance_for_regression: example.assets.example_components._assets._AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForRegressionEnum = _AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForRegressionEnum.mean_absolute_error) example.assets.example_components._assets._AzuremlTuneModelHyperparametersComponent
Perform a parameter sweep on the model to determine the optimum parameter settings.
- Parameters
untrained_model (Path) – Untrained model for parameter sweep
training_dataset (Path) – Input dataset for training
optional_validation_dataset (Path) – Input dataset for validation (for Train/Test validation mode) (optional)
specify_parameter_sweeping_mode (_AzuremlTuneModelHyperparametersSpecifyParameterSweepingModeEnum) – Sweep the entire grid of the parameter space, or sweep using a limited number of sample runs (enum: [‘Entire grid’, ‘Random sweep’])
maximum_number_of_runs_on_random_sweep (int) – Execute maximum number of runs using random sweep (optional, min: 1, max: 10000)
random_seed (int) – Provide a value to seed the random number generator (optional, max: 4294967295)
name_or_numerical_index_of_the_label_column (str) – Label column
metric_for_measuring_performance_for_classification (_AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForClassificationEnum) – Select the metric used for evaluating classification models (enum: [‘Accuracy’, ‘Precision’, ‘Recall’, ‘F-score’, ‘AUC’, ‘Average Log Loss’])
metric_for_measuring_performance_for_regression (_AzuremlTuneModelHyperparametersMetricForMeasuringPerformanceForRegressionEnum) – Select the metric used for evaluating regression models (enum: [‘Mean absolute error’, ‘Root of mean squared error’, ‘Relative absolute error’, ‘Relative squared error’, ‘Coefficient of determination’])
- Output sweep_results
Results metric for parameter sweep runs
- Type
sweep_results: Output
- Output trained_best_model
Model with best performance on the training dataset
- Type
trained_best_model: Output
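The two sweeping modes can be illustrated with a minimal sketch. This is not the component's implementation: the helper names entire_grid and random_sweep and the candidate values in space are hypothetical, and the sketch only assumes that ‘Entire grid’ tries every combination while ‘Random sweep’ draws a seeded sample capped at maximum_number_of_runs_on_random_sweep.

```python
import itertools
import random


def entire_grid(space):
    """Return every combination of the candidate values in `space`."""
    names = list(space)
    combos = itertools.product(*(space[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def random_sweep(space, max_runs, seed=0):
    """Return at most `max_runs` distinct combinations using a seeded RNG."""
    grid = entire_grid(space)
    rng = random.Random(seed)
    return rng.sample(grid, min(max_runs, len(grid)))


# Hypothetical parameter space, for illustration only.
space = {"learning_rate": [0.1, 0.5, 1.0], "iterations": [1, 10]}
full = entire_grid(space)
sampled = random_sweep(space, max_runs=5, seed=0)
print(len(full), len(sampled))  # 6 5
```

Reusing the same seed reproduces the same sample, which is the role random_seed would play under this assumption.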
- example.assets.example_components.azureml_two_class_averaged_perceptron(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassAveragedPerceptronCreateTrainerModeEnum = _AzuremlTwoClassAveragedPerceptronCreateTrainerModeEnum.singleparameter, initial_learning_rate: float = 1.0, maximum_number_of_iterations: int = 10, range_for_initial_learning_rate: str = '0.1; 0.5; 1.0', range_for_maximum_number_of_iterations: str = '1; 10', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlTwoClassAveragedPerceptronComponent
Creates an averaged perceptron binary classification model.
- Parameters
create_trainer_mode (_AzuremlTwoClassAveragedPerceptronCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
initial_learning_rate (float) – The initial learning rate for the Stochastic Gradient Descent optimizer. (optional, min: 2.220446049250313e-16)
maximum_number_of_iterations (int) – The number of Stochastic Gradient Descent iterations to be performed over the training dataset. (optional, min: 1)
range_for_initial_learning_rate (str) – Range for initial learning rate for the Stochastic Gradient Descent optimizer. (optional)
range_for_maximum_number_of_iterations (str) – Range for the number of Stochastic Gradient Descent iterations to be performed over the training dataset. (optional)
random_number_seed (int) – The seed for the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained binary classification model that can be connected to the Create One-vs-All Multi-class Classifier or Train Generic Model or Cross Validate Model modules.
- Type
untrained_model: Output
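The range_for_* parameters are passed as strings such as '0.1; 0.5; 1.0'. A minimal sketch of turning such a string into typed values follows, assuming the values are separated by semicolons as in the defaults above; parse_range is a hypothetical helper, not part of the package. The documented lower bound 2.220446049250313e-16 equals double-precision machine epsilon, which Python exposes as sys.float_info.epsilon on IEEE 754 platforms.

```python
import sys


def parse_range(text, cast=float):
    """Split a semicolon-separated range string into a list of typed values."""
    return [cast(part.strip()) for part in text.split(";") if part.strip()]


learning_rates = parse_range("0.1; 0.5; 1.0")
iterations = parse_range("1; 10", cast=int)

# Check the candidates against the documented minimums.
rates_ok = all(rate >= sys.float_info.epsilon for rate in learning_rates)
iterations_ok = all(count >= 1 for count in iterations)
print(learning_rates, iterations, rates_ok, iterations_ok)
```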
- example.assets.example_components.azureml_two_class_boosted_decision_tree(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassBoostedDecisionTreeCreateTrainerModeEnum = _AzuremlTwoClassBoostedDecisionTreeCreateTrainerModeEnum.singleparameter, maximum_number_of_leaves_per_tree: int = 20, minimum_number_of_training_instances_required_to_form_a_leaf: int = 10, the_learning_rate: float = 0.2, total_number_of_trees_constructed: int = 100, range_for_maximum_number_of_leaves_per_tree: str = '2; 8; 32; 128', range_for_minimum_number_of_training_instances_required_to_form_a_leaf: str = '1; 10; 50', range_for_learning_rate: str = '0.025; 0.05; 0.1; 0.2; 0.4', range_for_total_number_of_trees_constructed: str = '20; 100; 500', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlTwoClassBoostedDecisionTreeComponent
Creates a binary classifier using a boosted decision tree algorithm.
- Parameters
create_trainer_mode (_AzuremlTwoClassBoostedDecisionTreeCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
maximum_number_of_leaves_per_tree (int) – Specify the maximum number of leaves allowed per tree (optional, min: 2, max: 131072)
minimum_number_of_training_instances_required_to_form_a_leaf (int) – Specify the minimum number of cases required to form a leaf (optional, min: 1)
the_learning_rate (float) – Specify the initial learning rate (optional, min: 2.220446049250313e-16, max: 1.0)
total_number_of_trees_constructed (int) – Specify the maximum number of trees that can be created during training (optional, min: 1)
range_for_maximum_number_of_leaves_per_tree (str) – Specify range for the maximum number of leaves allowed per tree (optional)
range_for_minimum_number_of_training_instances_required_to_form_a_leaf (str) – Specify the range for the minimum number of cases required to form a leaf (optional)
range_for_learning_rate (str) – Specify the range for the initial learning rate (optional)
range_for_total_number_of_trees_constructed (str) – Specify the range for the maximum number of trees that can be created during training (optional)
random_number_seed (int) – Type a value to seed the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained binary classification model
- Type
untrained_model: Output
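To get a feel for how large a sweep over the default ranges would be, the sketch below counts the combinations. It assumes that in ParameterRange mode an entire-grid sweep tries every combination of the listed values; count_candidates is a hypothetical helper.

```python
import math

# Default range strings copied from the signature above.
DEFAULT_RANGES = {
    "range_for_maximum_number_of_leaves_per_tree": "2; 8; 32; 128",
    "range_for_minimum_number_of_training_instances_required_to_form_a_leaf": "1; 10; 50",
    "range_for_learning_rate": "0.025; 0.05; 0.1; 0.2; 0.4",
    "range_for_total_number_of_trees_constructed": "20; 100; 500",
}


def count_candidates(range_text):
    """Count the semicolon-separated values in one range string."""
    return len([part for part in range_text.split(";") if part.strip()])


sizes = {name: count_candidates(text) for name, text in DEFAULT_RANGES.items()}
grid_size = math.prod(sizes.values())  # requires Python 3.8+
print(grid_size)  # 180
```

Under that assumption the default ranges give 4 × 3 × 5 × 3 = 180 combinations.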
- example.assets.example_components.azureml_two_class_decision_forest(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassDecisionForestCreateTrainerModeEnum = _AzuremlTwoClassDecisionForestCreateTrainerModeEnum.singleparameter, number_of_decision_trees: int = 8, maximum_depth_of_the_decision_trees: int = 32, minimum_number_of_samples_per_leaf_node: int = 1, range_for_number_of_decision_trees: str = '1; 8; 32', range_for_the_maximum_depth_of_the_decision_trees: str = '1; 16; 64', range_for_the_minimum_number_of_samples_per_leaf_node: str = '1; 4; 16', resampling_method: example.assets.example_components._assets._AzuremlTwoClassDecisionForestResamplingMethodEnum = _AzuremlTwoClassDecisionForestResamplingMethodEnum.bagging_resampling) example.assets.example_components._assets._AzuremlTwoClassDecisionForestComponent
Creates a two-class classification model using the decision forest algorithm.
- Parameters
create_trainer_mode (_AzuremlTwoClassDecisionForestCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
number_of_decision_trees (int) – Specify the number of decision trees to create in the ensemble (optional, min: 1)
maximum_depth_of_the_decision_trees (int) – Specify the maximum depth of any decision tree that can be created in the ensemble (optional, min: 1)
minimum_number_of_samples_per_leaf_node (int) – Specify the minimum number of training samples required to generate a leaf node (optional, min: 1)
range_for_number_of_decision_trees (str) – Specify range for the number of decision trees to create in the ensemble (optional)
range_for_the_maximum_depth_of_the_decision_trees (str) – Specify range for the maximum depth of the decision trees (optional)
range_for_the_minimum_number_of_samples_per_leaf_node (str) – Specify range for the minimum number of samples per leaf node (optional)
resampling_method (_AzuremlTwoClassDecisionForestResamplingMethodEnum) – Choose a resampling method (enum: [‘Bagging Resampling’, ‘Replicate Resampling’])
- Output untrained_model
An untrained binary classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_two_class_logistic_regression(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassLogisticRegressionCreateTrainerModeEnum = _AzuremlTwoClassLogisticRegressionCreateTrainerModeEnum.singleparameter, optimization_tolerance: float = 1e-07, l2_regularizaton_weight: float = 1.0, range_for_optimization_tolerance: str = '0.00001; 0.00000001', range_for_l2_regularization_weight: str = '0.01; 0.1; 1.0', random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlTwoClassLogisticRegressionComponent
Creates a two-class logistic regression model.
- Parameters
create_trainer_mode (_AzuremlTwoClassLogisticRegressionCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
optimization_tolerance (float) – Specify a tolerance value for the L-BFGS optimizer (optional, min: 2.220446049250313e-16)
l2_regularizaton_weight (float) – Specify the L2 regularization weight. Use a non-zero value to avoid overfitting. (optional)
range_for_optimization_tolerance (str) – Specify a range for the tolerance value for the L-BFGS optimizer (optional)
range_for_l2_regularization_weight (str) – Specify the range for the L2 regularization weight. Use a non-zero value to avoid overfitting. (optional)
random_number_seed (int) – Type a value to seed the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_two_class_neural_network(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassNeuralNetworkCreateTrainerModeEnum = _AzuremlTwoClassNeuralNetworkCreateTrainerModeEnum.singleparameter, hidden_layer_specification: example.assets.example_components._assets._AzuremlTwoClassNeuralNetworkHiddenLayerSpecificationEnum = _AzuremlTwoClassNeuralNetworkHiddenLayerSpecificationEnum.fully_connected_case, number_of_hidden_nodes: str = '100', the_learning_rate: float = 0.1, number_of_learning_iterations: int = 100, hidden_layer_specification1: example.assets.example_components._assets._AzuremlTwoClassNeuralNetworkHiddenLayerSpecification1Enum = _AzuremlTwoClassNeuralNetworkHiddenLayerSpecification1Enum.fully_connected_case, number_of_hidden_nodes1: str = '100', range_for_learning_rate: str = '0.1; 0.2; 0.4', range_for_number_of_learning_iterations: str = '20; 40; 80; 160', the_momentum: float = 0, shuffle_examples: bool = True, random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlTwoClassNeuralNetworkComponent
Creates a binary classifier using a neural network algorithm.
- Parameters
create_trainer_mode (_AzuremlTwoClassNeuralNetworkCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
hidden_layer_specification (_AzuremlTwoClassNeuralNetworkHiddenLayerSpecificationEnum) – Specify the architecture of the hidden layer or layers (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes (str) – Type the number of nodes in the hidden layer. For multiple hidden layers, type a comma-separated list. (optional)
the_learning_rate (float) – Specify the size of each step in the learning process (optional, min: 2.220446049250313e-16, max: 2.0)
number_of_learning_iterations (int) – Specify the number of iterations while learning (optional, min: 1)
hidden_layer_specification1 (_AzuremlTwoClassNeuralNetworkHiddenLayerSpecification1Enum) – Specify the architecture of the hidden layer or layers for range (optional, enum: [‘Fully-connected case’])
number_of_hidden_nodes1 (str) – Type the number of nodes in the hidden layer, or for multiple hidden layers, type a comma-separated list. (optional)
range_for_learning_rate (str) – Specify the range for the size of each step in the learning process (optional)
range_for_number_of_learning_iterations (str) – Specify the range for the number of iterations while learning (optional)
the_momentum (float) – Specify a weight to apply during learning to nodes from previous iterations (max: 1.0)
shuffle_examples (bool) – Select this option to change the order of instances between learning iterations
random_number_seed (int) – Specify a numeric seed to use for random number generation. Leave blank to use the default seed. (optional, max: 4294967295)
- Output untrained_model
An untrained binary classification model
- Type
untrained_model: Output
- example.assets.example_components.azureml_two_class_support_vector_machine(create_trainer_mode: example.assets.example_components._assets._AzuremlTwoClassSupportVectorMachineCreateTrainerModeEnum = _AzuremlTwoClassSupportVectorMachineCreateTrainerModeEnum.singleparameter, number_of_iterations: int = 10, the_value_lambda: float = 0.001, range_for_number_of_iterations: str = '1; 10; 100', range_for_lambda: str = '0.00001; 0.0001; 0.001; 0.01; 0.1', normalize_the_features: bool = True, random_number_seed: Optional[int] = None) example.assets.example_components._assets._AzuremlTwoClassSupportVectorMachineComponent
Creates a binary classification model using the Support Vector Machine algorithm.
- Parameters
create_trainer_mode (_AzuremlTwoClassSupportVectorMachineCreateTrainerModeEnum) – Create advanced learner options (enum: [‘SingleParameter’, ‘ParameterRange’])
number_of_iterations (int) – The number of iterations. (optional, min: 1)
the_value_lambda (float) – Weight for L1 regularization. Using a non-zero value avoids overfitting the model to the training dataset. (optional, min: 2.220446049250313e-16)
range_for_number_of_iterations (str) – The range for the number of iterations. (optional)
range_for_lambda (str) – Weight range for L1 regularization. Using a non-zero value avoids overfitting the model to the training dataset. (optional)
normalize_the_features (bool) – If true, normalize the features.
random_number_seed (int) – The seed for the random number generator used by the model. Leave blank for default. (optional, max: 4294967295)
- Output untrained_model
An untrained binary classification model that can be connected to the Create One-vs-All Multiclass Classification Model or Train Generic Model or Cross Validate Model modules.
- Type
untrained_model: Output
- example.assets.example_components.bing_relevance_convert2ss(TextData: Optional[pathlib.Path] = None, ExtractionClause: Optional[str] = None) example.assets.example_components._assets._BingRelevanceConvert2SsComponent
Convert ADLS test data to SS format
- Parameters
TextData (Path) – relative path on ADLS storage
ExtractionClause (str) – the extraction clause, something like “column1:string, column2:int”
- Output SSPath
output path of ss
- Type
SSPath: Output
- example.assets.example_components.bing_relevance_convert2ss_isresource(TextData: Optional[pathlib.Path] = None, ExtractionClause: Optional[str] = None) example.assets.example_components._assets._BingRelevanceConvert2SsIsresourceComponent
Convert ADLS test data to SS format
- Parameters
TextData (Path) – relative path on ADLS storage
ExtractionClause (str) – the extraction clause, something like “column1:string, column2:int”
- Output SSPath
output path of ss
- Type
SSPath: Output
- example.assets.example_components.fine_tune_for_huggingface_text_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: int = 128, per_device_train_batch_size: int = 8, learning_rate: float = 5e-05, num_train_epochs: int = 1) example.assets.example_components._assets._FineTuneForHuggingfaceTextClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output
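The max_seq_length behaviour described above (longer sequences truncated, shorter ones padded) can be sketched in plain Python. This is only an illustration: pad_or_truncate is a hypothetical helper, and the pad id of 0 and right-side padding are assumptions, since the actual tokenizer settings are not documented here.

```python
def pad_or_truncate(token_ids, max_seq_length=128, pad_id=0):
    """Force a list of token ids to exactly `max_seq_length` entries."""
    clipped = token_ids[:max_seq_length]  # drop tokens past the limit
    padding = [pad_id] * (max_seq_length - len(clipped))  # fill the remainder
    return clipped + padding


short = pad_or_truncate([5, 6, 7], max_seq_length=5)
long = pad_or_truncate(list(range(10)), max_seq_length=5)
print(short)  # [5, 6, 7, 0, 0]
print(long)   # [0, 1, 2, 3, 4]
```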
- example.assets.example_components.fine_tune_for_huggingface_text_generation(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, per_device_train_batch_size: int = 8, learning_rate: float = 5e-05, num_train_epochs: int = 1) example.assets.example_components._assets._FineTuneForHuggingfaceTextGenerationComponent
- Parameters
model (Path) – path
dataset (Path) – path
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output
- example.assets.example_components.fine_tune_for_huggingface_token_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: int = 128, per_device_train_batch_size: int = 8, learning_rate: float = 5e-05, num_train_epochs: int = 1) example.assets.example_components._assets._FineTuneForHuggingfaceTokenClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output
- example.assets.example_components.microsoft_com_azureml_samples_hello_world_with_cpu_image(input_path: Optional[pathlib.Path] = None, string_parameter: Optional[str] = None) example.assets.example_components._assets._MicrosoftComAzuremlSamplesHelloWorldWithCpuImageComponent
A hello world tutorial to create a module for ml.azure.com.
- Parameters
input_path (Path) – The directory that contains the dataframe.
string_parameter (str) – A parameter that accepts a string value. (optional)
- Output output_path
The directory that contains a dataframe.
- Type
output_path: Output
- example.assets.example_components.microsoft_com_azureml_samples_parallel_copy_files_v1(input_folder: Optional[pathlib.Path] = None) example.assets.example_components._assets._MicrosoftComAzuremlSamplesParallelCopyFilesV1Component
A sample Parallel module to copy files.
- Parameters
input_folder (Path) – AnyDirectory
- Output output_folder
Output files
- Type
output_folder: Output
- example.assets.example_components.microsoft_com_azureml_samples_sweep_train(training_data: Optional[pathlib.Path] = None, max_epochs: Optional[int] = None, learning_rate: Optional[float] = None, subsample: Optional[float] = None) example.assets.example_components._assets._MicrosoftComAzuremlSamplesSweepTrainComponent
A dummy train component
- Parameters
training_data (Path) – Training data organized in the torchvision format/structure
max_epochs (int) – Maximum number of epochs for the training
learning_rate (float) – learning_rate (min: 0.001, max: 0.1)
subsample (float) – subsample (min: 0.1, max: 0.5)
- Output saved_model
path
- Type
saved_model: Output
- Output other_output
path
- Type
other_output: Output
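The documented bounds for learning_rate and subsample suggest values drawn from those intervals during a sweep. A minimal sketch of such sampling follows; uniform sampling is an assumption (learning rates are often sampled on a log scale instead), and BOUNDS and sample_trial are hypothetical names.

```python
import random

# Bounds copied from the parameter descriptions above.
BOUNDS = {"learning_rate": (0.001, 0.1), "subsample": (0.1, 0.5)}


def sample_trial(rng):
    """Draw one value per parameter, uniformly within its documented bounds."""
    return {name: rng.uniform(low, high) for name, (low, high) in BOUNDS.items()}


rng = random.Random(0)  # seeded so that the trials are reproducible
trials = [sample_trial(rng) for _ in range(3)]
for trial in trials:
    print(trial)
```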
- example.assets.example_components.microsoft_com_azureml_samples_train_in_spark(input_path: Optional[pathlib.Path] = None, regularization_rate: float = 0.01) example.assets.example_components._assets._MicrosoftComAzuremlSamplesTrainInSparkComponent
Train a Spark ML model using an HDInsight Spark cluster
- Parameters
input_path (Path) – Iris csv file
regularization_rate (float) – Regularization rate when training with logistic regression (optional)
- Output output_path
The output path to save the trained model to
- Type
output_path: Output
- example.assets.example_components.microsoft_com_azureml_samples_tune(training_data: Optional[pathlib.Path] = None, max_epochs: Optional[int] = None, learning_rate: Optional[float] = None, subsample: Optional[float] = None) example.assets.example_components._assets._MicrosoftComAzuremlSamplesTuneComponent
A dummy hyperparameter tuning component
- Parameters
training_data (Path) – Training data organized in the torchvision format/structure
max_epochs (int) – Maximum number of epochs for the training
learning_rate (float) – learning_rate (min: 0.001, max: 0.1)
subsample (float) – subsample (min: 0.1, max: 0.5)
- Output best_model
model
- Type
best_model: Output
- Output saved_model
path
- Type
saved_model: Output
- Output other_output
path
- Type
other_output: Output
- example.assets.example_components.score_for_huggingface_text_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: int = 128) example.assets.example_components._assets._ScoreForHuggingfaceTextClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
- Output output_dir
path
- Type
output_dir: Output
- example.assets.example_components.score_for_huggingface_text_generation(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None) example.assets.example_components._assets._ScoreForHuggingfaceTextGenerationComponent
- Parameters
model (Path) – path
dataset (Path) – path
- Output output_dir
path
- Type
output_dir: Output
- example.assets.example_components.score_for_huggingface_token_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: int = 128) example.assets.example_components._assets._ScoreForHuggingfaceTokenClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
- Output output_dir
path
- Type
output_dir: Output
- example.assets.example_components.sweep_for_huggingface_text_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: Optional[int] = None, per_device_train_batch_size: int = 8, learning_rate: Optional[float] = None, num_train_epochs: int = 1) example.assets.example_components._assets._SweepForHuggingfaceTextClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output
- example.assets.example_components.sweep_for_huggingface_text_generation(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, per_device_train_batch_size: int = 8, learning_rate: Optional[float] = None, num_train_epochs: int = 1) example.assets.example_components._assets._SweepForHuggingfaceTextGenerationComponent
- Parameters
model (Path) – path
dataset (Path) – path
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output
- example.assets.example_components.sweep_for_huggingface_token_classification(model: Optional[pathlib.Path] = None, dataset: Optional[pathlib.Path] = None, max_seq_length: Optional[int] = None, per_device_train_batch_size: int = 8, learning_rate: Optional[float] = None, num_train_epochs: int = 1) example.assets.example_components._assets._SweepForHuggingfaceTokenClassificationComponent
- Parameters
model (Path) – path
dataset (Path) – path
max_seq_length (int) – The maximum total input sequence length after tokenization. Sequences longer than this will be truncated, sequences shorter will be padded. (optional)
per_device_train_batch_size (int) – Batch size per GPU/TPU core/CPU for training. (optional)
learning_rate (float) – The initial learning rate for AdamW. (optional)
num_train_epochs (int) – Total number of training epochs to perform. (optional)
- Output output_model
path
- Type
output_model: Output