Developer tooling
treecat.profile
-
treecat.profile.
eval
(rows=100, cols=10, cats=4, tool='timers') Profile treecat.validate.eval on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.
-
treecat.profile.
serve
(rows=100, cols=10, cats=4, tool='timers') Profile TreeCatServer on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.
-
treecat.profile.
serve_files
(model_path, config_path, num_samples) INTERNAL: Serve from a pickled model and config.
-
treecat.profile.
train
(rows=100, cols=10, epochs=5, clusters=32, parallel=False, tool='timers') Profile TreeCatTrainer on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.
-
treecat.profile.
train_files
(dataset_path, config_path) INTERNAL: Train from a pickled dataset and config.
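The snakeviz tool listed above visualizes profiles recorded by Python's standard cProfile module. The following is a minimal sketch of collecting such a profile by hand, using only the standard library. It does not show how treecat.profile works internally; `busy_work` and the `busy_work.prof` filename are hypothetical stand-ins for whatever call you want to measure.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical stand-in workload; replace with the call to profile."""
    return sum(i * i for i in range(n))


# Record a profile around the workload.
profiler = cProfile.Profile()
profiler.enable()
result = busy_work(100_000)
profiler.disable()

# Render the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)

# To inspect the profile in snakeviz, write it to disk first:
# stats.dump_stats("busy_work.prof")
```

A stats file written with `dump_stats` can then be opened from the command line with `snakeviz busy_work.prof`.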
treecat.generate
-
treecat.generate.
clean
() Clean out cache of generated datasets.
-
treecat.generate.
generate_clean_dataset
(tree, num_rows, num_cats) Generate a dataset whose structure should be easy to learn.
This generates a highly correlated uniformly distributed dataset with given tree structure. This is useful to test that structure learning can recover a known structure.
- Args:
- tree: A TreeStructure instance.
- num_rows: The number of rows in the generated dataset.
- num_cats: The number of categories in the generated categorical dataset. This will also be used for the number of latent classes.
- Returns:
- A dict with key ‘table’ and value a Table object.
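To illustrate what a "highly correlated uniformly distributed dataset with given tree structure" can look like, here is a simplified sketch using only the standard library. It is not treecat's implementation: it skips the latent classes entirely, and the `edges` list of (parent, child) pairs, the `flip_prob` parameter, and the `seed` parameter are all assumptions made for this example. Each column copies its tree parent most of the time and is otherwise resampled uniformly, so every column stays uniformly distributed while neighbors in the tree agree often.

```python
import random


def sample_correlated_rows(edges, num_cols, num_rows, num_cats,
                           flip_prob=0.1, seed=0):
    """Sample rows in which each column usually copies its tree parent.

    edges is a hypothetical list of (parent, child) column pairs, ordered
    so that every parent is assigned before its children; column 0 is the
    root of the tree.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(num_rows):
        row = [None] * num_cols
        row[0] = rng.randrange(num_cats)  # uniform value at the root
        for parent, child in edges:
            if rng.random() < flip_prob:
                row[child] = rng.randrange(num_cats)  # occasional resample
            else:
                row[child] = row[parent]  # copy the parent's value
        rows.append(row)
    return rows


# A small tree over four columns: 0 -> 1, 0 -> 2, 2 -> 3.
edges = [(0, 1), (0, 2), (2, 3)]
rows = sample_correlated_rows(edges, num_cols=4, num_rows=1000, num_cats=4)

# Fraction of rows in which the directly linked columns 0 and 1 agree.
agree = sum(r[0] == r[1] for r in rows) / len(rows)
print(agree)
```

Columns joined by an edge agree far more often than the chance rate of 1/num_cats, which is the kind of signal a structure learner can use to recover the tree.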
-
treecat.generate.
generate_dataset
(num_rows, num_cols, num_cats=4, rate=1.0) Generate a random dataset.
- Returns:
- A dataset dict with fields ‘schema’ and ‘table’.
-
treecat.generate.
generate_dataset_file
(num_rows, num_cols, num_cats=4, rate=1.0) Generate a random dataset and save it to a file.
- Returns:
- The path to a gzipped pickled data table.
-
treecat.generate.
generate_model_file
(num_rows, num_cols, num_cats=4, rate=1.0) Generate a random model and save it to a file.
- Returns:
- The path to a gzipped pickled model.
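The file-producing functions above return paths to gzipped pickle files. As a rough sketch of what reading and writing such a file involves, the round trip below uses only the standard gzip and pickle modules. The helper names `save_gz_pickle` and `load_gz_pickle`, the `dataset.pkz` filename, and the payload are hypothetical; treecat may ship its own helpers and its own file layout, which should be preferred when available.

```python
import gzip
import os
import pickle
import tempfile


def save_gz_pickle(obj, path):
    """Pickle obj and write it gzip-compressed to path."""
    with gzip.open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_gz_pickle(path):
    """Read a gzip-compressed pickle from path."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical payload shaped loosely like a dataset dict.
payload = {"schema": {"num_cols": 3}, "table": [[0, 1, 2], [2, 1, 0]]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.pkz")
    save_gz_pickle(payload, path)
    restored = load_gz_pickle(path)

print(restored == payload)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.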