Replacing outliers
replace_outliers
ReplaceCooksOutliers
Bases: ComputeTrimmedMean, LocFindCooksOutliers, AggMergeOutlierGenes, LocReplaceCooksOutliers, LocSetRefitAdata, AggNewAllZeros, LocSetNewAllZerosAndGetFeatures
Mixin class to replace Cook's outliers.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/replace_outliers.py
replace_outliers(train_data_nodes, aggregation_node, local_states, cooks_shared_state, round_idx, clean_models)
Replace outlier counts.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
train_data_nodes | | List of TrainDataNode. | required |
aggregation_node | | The aggregation node. | required |
local_states | | Local states. Required to propagate intermediate results. | required |
cooks_shared_state | | Shared state with the dispersion values for Cook's distances, in a "cooks_dispersions" key. | required |
round_idx | | Index of the current round. | required |
clean_models | | Whether to clean the models after the computation. | required |
Returns:
Name | Type | Description |
---|---|---|
local_states | dict | Local states. The new local state contains Cook's distances. |
shared_states | list[dict] | List of shared states with the features vector to input to compute_genewise_dispersion in a "local_features" key. |
round_idx | int | The updated round index. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/replace_outliers.py
substeps
AggMergeOutlierGenes
Build the global list of genes to replace.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
agg_merge_outlier_genes(shared_states)
Merge the lists of genes to replace.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shared_states | list | List of dictionaries, each containing "local_genes_to_replace" (genes with Cook's distance above the threshold) and "replaceable_samples" (a boolean indicating whether at least one sample has enough replicates for its outlier counts to be replaced). | required |
Returns:
Type | Description |
---|---|
dict | A dictionary with a single key, "genes_to_replace", containing the list of genes for which to replace outlier values. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
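As an illustration of the merge described above, here is a minimal sketch in plain Python. It is not the library's implementation: the function name `merge_outlier_genes` is hypothetical, and the rules that the merge is a set union and that nothing is replaced when no center reports a replaceable sample are assumptions made for the example.

```python
def merge_outlier_genes(shared_states):
    """Hypothetical stand-in for agg_merge_outlier_genes (illustration only)."""
    # Assumption: if no center has a replaceable sample, no gene is replaced.
    if not any(state["replaceable_samples"] for state in shared_states):
        return {"genes_to_replace": []}
    # Assumption: the global list is the union of the local lists.
    merged = set()
    for state in shared_states:
        merged.update(state["local_genes_to_replace"])
    # Sort so that the result does not depend on the order of the centers.
    return {"genes_to_replace": sorted(merged)}


shared_states = [
    {"local_genes_to_replace": ["geneA", "geneC"], "replaceable_samples": True},
    {"local_genes_to_replace": ["geneB", "geneC"], "replaceable_samples": False},
]
result = merge_outlier_genes(shared_states)
print(result)
```

Sorting the merged set is a choice made for this sketch so that every center receives the genes in the same order.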
AggNewAllZeros
Mixin to compute the new all-zero genes and share them with the centers.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
aggregate_new_all_zeros(shared_states)
Compute the new all-zero genes given the local results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
shared_states | list | List of results (local_mean, n_samples) from training nodes. In refit mode, also contains "loc_new_all_zero". | required |
Returns:
Type | Description |
---|---|
dict | New all-zero genes. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
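One plausible way to combine the local "loc_new_all_zero" indicators is a logical AND across centers: a gene is all-zero globally only if it is all-zero in every center. The sketch below assumes this rule. The function name `aggregate_all_zero_flags` is hypothetical, and the output key "new_all_zeroes" is borrowed from the input expected by local_set_new_all_zeros_get_features.

```python
def aggregate_all_zero_flags(shared_states):
    """Hypothetical stand-in for aggregate_new_all_zeros (illustration only)."""
    # One boolean list per center, with one entry per gene.
    local_flags = [state["loc_new_all_zero"] for state in shared_states]
    # Assumption: a gene is globally all-zero only if every center flags it.
    global_flags = [all(per_gene) for per_gene in zip(*local_flags)]
    return {"new_all_zeroes": global_flags}


shared_states = [
    {"loc_new_all_zero": [True, False, True]},
    {"loc_new_all_zero": [True, True, False]},
]
aggregated = aggregate_all_zero_flags(shared_states)
print(aggregated)
```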
LocFindCooksOutliers
Find local Cook's outliers.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
loc_find_cooks_outliers(data_from_opener, shared_state)
Find local Cook's outliers by comparing the Cook's distances to a threshold.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_from_opener | AnnData | AnnData returned by the opener. Not used. | required |
shared_state | dict | Not used. | required |
Returns:
Type | Description |
---|---|
dict | Shared state containing "local_genes_to_replace" (genes with Cook's distance above the threshold) and "replaceable_samples" (a boolean indicating whether at least one sample has enough replicates for its outlier counts to be replaced). |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
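The thresholding step can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the library's code: the function name `find_local_outlier_genes`, its arguments, and the array layout (samples in rows, genes in columns) are hypothetical, and the threshold is passed in rather than derived.

```python
import numpy as np


def find_local_outlier_genes(cooks, gene_names, threshold, replaceable):
    """Hypothetical stand-in for loc_find_cooks_outliers (illustration only).

    cooks: (n_samples, n_genes) array of Cook's distances.
    gene_names: (n_genes,) array of gene identifiers.
    threshold: scalar cutoff (assumed to be given).
    replaceable: (n_samples,) boolean array marking replaceable samples.
    """
    # A gene is flagged if any sample exceeds the threshold for that gene.
    flagged = (cooks > threshold).any(axis=0)
    return {
        "local_genes_to_replace": gene_names[flagged].tolist(),
        "replaceable_samples": bool(replaceable.any()),
    }


cooks = np.array([[0.1, 5.0, 0.2], [0.3, 0.4, 0.1]])
gene_names = np.array(["g1", "g2", "g3"])
replaceable = np.array([True, False])
state = find_local_outlier_genes(cooks, gene_names, 1.0, replaceable)
print(state)
```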
LocReplaceCooksOutliers
Mixin to replace Cook's outliers locally.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
loc_replace_cooks_outliers(data_from_opener, shared_state)
Replace outlier counts with imputed values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_from_opener | AnnData | AnnData returned by the opener. Not used. | required |
shared_state | dict | A dictionary with a "trimmed_mean_normed_counts" key, containing the trimmed means to use to compute the imputed values. | required |
Returns:
Type | Description |
---|---|
dict | A dictionary containing "loc_new_all_zero": a boolean array indicating which genes are now all-zero. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
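To make the imputation concrete, the sketch below replaces outlier counts with the trimmed mean of normalized counts rescaled by each sample's size factor, then recomputes which genes are all-zero. The function name `replace_outlier_counts`, its arguments, the use of size factors, and the truncation to integers are assumptions for this example, not a description of the library's exact behavior.

```python
import numpy as np


def replace_outlier_counts(counts, cooks, size_factors, trimmed_mean,
                           threshold, replaceable):
    """Hypothetical stand-in for loc_replace_cooks_outliers (illustration only).

    counts, cooks: (n_samples, n_genes) arrays.
    size_factors, replaceable: (n_samples,) arrays.
    trimmed_mean: (n_genes,) trimmed mean of normalized counts.
    """
    # Assumption: imputed count = trimmed mean rescaled to the sample's depth,
    # truncated to an integer.
    imputed = (size_factors[:, None] * trimmed_mean[None, :]).astype(counts.dtype)
    # Only replace counts that are outliers in replaceable samples.
    to_replace = (cooks > threshold) & replaceable[:, None]
    new_counts = np.where(to_replace, imputed, counts)
    # Genes whose counts are zero in every local sample after replacement.
    return new_counts, {"loc_new_all_zero": (new_counts == 0).all(axis=0)}


counts = np.array([[10, 0, 500], [12, 0, 8], [9, 0, 11]])
cooks = np.array([[0.1, 0.0, 9.0], [0.2, 0.0, 0.1], [0.1, 0.0, 0.3]])
size_factors = np.array([1.0, 2.0, 0.5])
trimmed_mean = np.array([10.0, 0.0, 9.5])
replaceable = np.array([True, True, True])
new_counts, shared_state = replace_outlier_counts(
    counts, cooks, size_factors, trimmed_mean, 1.0, replaceable
)
print(new_counts)
print(shared_state)
```

In this toy input only the count of 500 is replaced, by the truncated value of 1.0 × 9.5.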
LocSetNewAllZerosAndGetFeatures
Mixin to set the new all-zero genes and return local features.
This Mixin implements the method that performs the transition to the compute_rough_dispersions steps after refitting. It sets the new all-zero genes in the local AnnData and computes the local features to be shared with the aggregation node.
Methods:
Name | Description |
---|---|
local_set_new_all_zeros_get_features | The method to set the new all-zero genes and compute the local features. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
local_set_new_all_zeros_get_features(data_from_opener, shared_state)
Set the new_all_zeros field and get the features.
This method sets the new_all_zeros field in the uns field of local_adata. This field holds the set of genes that are all-zero after outlier replacement.
It then restricts refit_adata to the genes that are not all-zero.
Finally, it computes the local features to be shared with the aggregation node via shared_state.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_from_opener | AnnData | AnnData returned by the opener. Not used. | required |
shared_state | dict | Shared state containing the "new_all_zeroes" key. | required |
Returns:
Type | Description |
---|---|
dict | Local feature vector to be shared via shared_state to the aggregation node. |
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
LocSetRefitAdata
Mixin to set the refit adata locally.
Source code in fedpydeseq2/core/deseq2_core/replace_outliers/substeps.py
loc_set_refit_adata(data_from_opener, shared_state)
Set a refit adata containing the counts of the genes to replace.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data_from_opener | AnnData | AnnData returned by the opener. Not used. | required |
shared_state | dict | A dictionary with a "genes_to_replace" key, containing the list of genes for which to replace outlier values. | required |
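The idea of building a refit dataset restricted to the genes to replace can be sketched with plain NumPy arrays standing in for the AnnData object. The function name `select_refit_counts` and the array layout (samples in rows, genes in columns) are hypothetical, and the real method operates on the local AnnData rather than on raw arrays.

```python
import numpy as np


def select_refit_counts(counts, gene_names, genes_to_replace):
    """Hypothetical stand-in for loc_set_refit_adata (illustration only)."""
    # Boolean mask over genes: True for genes whose outliers are to be replaced.
    mask = np.isin(gene_names, list(genes_to_replace))
    # Keep only the matching columns, preserving the original gene order.
    return counts[:, mask], gene_names[mask]


counts = np.array([[10, 0, 500], [12, 3, 8]])
gene_names = np.array(["g1", "g2", "g3"])
refit_counts, refit_genes = select_refit_counts(counts, gene_names, ["g3", "g1"])
print(refit_genes)
print(refit_counts)
```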