Developer tooling

treecat.profile

treecat.profile.eval(rows=100, cols=10, cats=4, tool='timers'): Profile treecat.validate.eval on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.

treecat.profile.serve(rows=100, cols=10, cats=4, tool='timers'): Profile TreeCatServer on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.

treecat.profile.serve_files(model_path, config_path, num_samples): INTERNAL. Serve from a pickled model and config.

treecat.profile.train(rows=100, cols=10, epochs=5, clusters=32, parallel=False, tool='timers'): Profile TreeCatTrainer on a random dataset. Available tools: timers, time, snakeviz, line_profiler, pdb.

treecat.profile.train_files(dataset_path, config_path): INTERNAL. Train from a pickled dataset and config.
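As a usage illustration, the following minimal sketch calls treecat.profile.train with the keyword arguments from the signature above. It assumes the function can be called directly from Python with those keywords, which this page does not confirm. TRAIN_KWARGS and profile_training are hypothetical names introduced here, and the small values are arbitrary.

```python
import importlib.util

# Keyword names copied from the documented signature of
# treecat.profile.train; the values are arbitrary small settings.
TRAIN_KWARGS = {
    'rows': 50,
    'cols': 5,
    'epochs': 1,
    'clusters': 8,
    'parallel': False,
    'tool': 'time',  # one of: timers, time, snakeviz, line_profiler, pdb
}


def profile_training(**overrides):
    """Run treecat.profile.train if treecat is importable.

    Hypothetical helper: returns True if the profiler was invoked and
    False if treecat is not installed.
    """
    kwargs = dict(TRAIN_KWARGS, **overrides)
    if importlib.util.find_spec('treecat') is None:
        return False
    # Assumption: train() accepts these keywords when called directly.
    from treecat import profile
    profile.train(**kwargs)
    return True
```

Calling profile_training(tool='snakeviz') would switch the profiling tool while keeping the other settings, again assuming the direct call is supported.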

treecat.generate

treecat.generate.clean(): Clean out the cache of generated datasets.

treecat.generate.generate_clean_dataset(tree, num_rows, num_cats)

Generate a dataset whose structure should be easy to learn.

This generates a highly correlated uniformly distributed dataset with given tree structure. This is useful to test that structure learning can recover a known structure.

Args:
    tree: A TreeStructure instance.
    num_rows: The number of rows in the generated dataset.
    num_cats: The number of categories in the generated categorical dataset. This is also used as the number of latent classes.

Returns: A dict with key ‘table’ whose value is a Table object.
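To make the idea of a highly correlated, uniformly distributed dataset with a given tree structure concrete, here is a minimal standalone sketch. It is not treecat's implementation and does not use TreeStructure or Table. The edge-list representation, the copy_prob parameter, and the function name sample_tree_correlated are all assumptions made for illustration.

```python
import random


def sample_tree_correlated(edges, num_rows, num_cats, copy_prob=0.9, seed=0):
    """Sample rows whose columns are correlated along the edges of a tree.

    edges: (parent, child) pairs for a tree rooted at column 0, ordered so
        that every parent is assigned before its children.
    copy_prob: hypothetical knob; the probability that a child column
        copies its parent's value instead of being drawn uniformly.
    """
    rng = random.Random(seed)
    num_cols = len(edges) + 1
    rows = []
    for _ in range(num_rows):
        row = [0] * num_cols
        # The root is uniform, so every column is marginally uniform.
        row[0] = rng.randrange(num_cats)
        for parent, child in edges:
            if rng.random() < copy_prob:
                row[child] = row[parent]
            else:
                row[child] = rng.randrange(num_cats)
        rows.append(row)
    return rows


# A small tree on four columns: 0 - 1, 0 - 2, 2 - 3.
edges = [(0, 1), (0, 2), (2, 3)]
rows = sample_tree_correlated(edges, num_rows=1000, num_cats=4)

# Fraction of rows in which the adjacent columns 0 and 1 agree.
agree = sum(r[0] == r[1] for r in rows) / len(rows)
```

Columns joined by an edge agree far more often than chance (1 / num_cats), and that is the signal a structure learner can use to recover the tree.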

treecat.generate.generate_dataset(num_rows, num_cols, num_cats=4, rate=1.0)

Generate a random dataset.

Returns: A dataset dict with fields ‘schema’ and ‘table’.

treecat.generate.generate_dataset_file(num_rows, num_cols, num_cats=4, rate=1.0)

Generate a random dataset and save it to a file.

Returns: The path to a gzipped pickled data table.
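A gzipped pickle file can be read with the standard library alone. The sketch below round-trips a placeholder object through a temporary file. The placeholder's contents and the helper name load_gzipped_pickle are assumptions, since the object layout treecat actually writes is not described here.

```python
import gzip
import os
import pickle
import tempfile


def load_gzipped_pickle(path):
    """Load a Python object from a gzip-compressed pickle file.

    Only use this on files you trust: unpickling can run arbitrary code.
    """
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in for a generated file; the real contents are treecat-specific.
placeholder = {'table': [[0, 1], [2, 3]]}
fd, path = tempfile.mkstemp(suffix='.pkl.gz')
os.close(fd)
with gzip.open(path, 'wb') as f:
    pickle.dump(placeholder, f)

loaded = load_gzipped_pickle(path)
os.remove(path)
```

The same helper would presumably apply to the path returned by generate_model_file, which is also described as a gzipped pickle.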

treecat.generate.generate_model_file(num_rows, num_cols, num_cats=4, rate=1.0)

Generate a random model and save it to a file.

Returns: The path to a gzipped pickled model.