First Steps
Before running diffusion algorithms on your network with DiffuPy, you need to know which graph and data formats are supported. Below you can find sample input datasets and networks to run diffusion methods over.
Input Data
You can submit your dataset in any of the following formats:
CSV (.csv)
TSV (.tsv)
Please ensure that the dataset has, at a minimum, a ‘Node’ column containing node IDs. You can also optionally add the following columns to your dataset:
NodeType
LogFC (log2 fold change)
p-value
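As a quick pre-flight check, the column requirements above can be verified before a file is submitted. The following is a minimal sketch using only Python's standard library. The helper name `check_dataset_columns` is hypothetical and not part of DiffuPy, and it assumes the column names are spelled exactly as listed above.

```python
import csv
import io

# Column names as listed in this section; exact spelling is an assumption.
REQUIRED_COLUMNS = {"Node"}
OPTIONAL_COLUMNS = {"NodeType", "LogFC", "p-value"}


def check_dataset_columns(text, delimiter=","):
    """Return the optional columns found in a delimited dataset.

    Hypothetical helper, not part of DiffuPy. Raises ValueError when the
    required 'Node' column is missing from the header row.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = set(reader.fieldnames or [])
    missing = REQUIRED_COLUMNS - header
    if missing:
        raise ValueError(f"Missing required column(s): {sorted(missing)}")
    return sorted(header & OPTIONAL_COLUMNS)


# A small CSV dataset with two of the optional columns.
example = "Node,LogFC,p-value\nA,4,0.03\nB,-1,0.05\n"
found = check_dataset_columns(example)
```

For a TSV file, pass `delimiter="\t"` instead of the default comma.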
Input dataset examples
DiffuPath accepts several input formats, which can be encoded in different ways. See the diffusion scores summary for more details.
1. You can provide a dataset with a column ‘Node’ containing node IDs.
Node |
---|
A |
B |
C |
D |
2. You can also provide a dataset with a column ‘Node’ containing node IDs and a column ‘NodeType’ giving the entity type of each node, so that diffusion can be run by entity type.
Node | NodeType |
---|---|
A | Gene |
B | Gene |
C | Metabolite |
D | Gene |
3. You can also provide a dataset with a column ‘Node’ containing node IDs and a column ‘LogFC’ with their log2 fold changes. You may also add a ‘NodeType’ column to run diffusion by entity type.
Node | LogFC |
---|---|
A | 4 |
B | -1 |
C | 1.5 |
D | 3 |
4. Finally, you can provide a dataset with a column ‘Node’ containing node IDs, a column ‘LogFC’ with their log2 fold changes, and a column ‘p-value’ with adjusted p-values. You may also add a ‘NodeType’ column to run diffusion by entity type.
Node | LogFC | p-value |
---|---|---|
A | 4 | 0.03 |
B | -1 | 0.05 |
C | 1.5 | 0.001 |
D | 3 | 0.07 |
See the sample datasets directory for example files.
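To make the dataset layouts above concrete, here is a hedged sketch that parses a TSV dataset containing all four columns and groups node IDs by entity type. It relies only on the standard library. The helper names `load_dataset` and `group_by_node_type` and the fallback label `"Unknown"` are assumptions made for illustration; the sketch does not show how DiffuPy itself reads or uses these columns.

```python
import csv
import io
from collections import defaultdict

# Values taken from the example tables above, combined into one TSV dataset.
DATASET_TSV = (
    "Node\tNodeType\tLogFC\tp-value\n"
    "A\tGene\t4\t0.03\n"
    "B\tGene\t-1\t0.05\n"
    "C\tMetabolite\t1.5\t0.001\n"
    "D\tGene\t3\t0.07\n"
)


def load_dataset(text, delimiter="\t"):
    """Parse a delimited dataset into row dicts, converting scores to floats."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        parsed = {"Node": row["Node"]}
        if "NodeType" in row:
            parsed["NodeType"] = row["NodeType"]
        for column in ("LogFC", "p-value"):
            if column in row:
                parsed[column] = float(row[column])
        rows.append(parsed)
    return rows


def group_by_node_type(rows):
    """Map each entity type to its node IDs ('Unknown' is an assumed fallback)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("NodeType", "Unknown")].append(row["Node"])
    return dict(groups)


rows = load_dataset(DATASET_TSV)
groups = group_by_node_type(rows)
```

Because the optional columns are only read when present, the same helpers accept any of the four layouts shown above.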
Networks
If you would like to submit your own networks, please ensure they are in one of the following formats:
BEL (.bel)
CSV (.csv)
GML (.gml or .xml)
GraphML (.graphml or .xml)
Pickle (.pickle): a BELGraph object from PyBEL 0.13.2
TSV (.tsv)
TXT (.txt)
At a minimum, please ensure that the network file you submit includes the following columns:
Source
Target
Optionally, you can add a third column, “Relation”, to your network (as in the example below). If the relation between the Source and Target nodes is omitted, or if the directionality is ambiguous, either node can be assigned as the Source or the Target.
Custom-network example
Source | Target | Relation |
---|---|---|
A | B | Increase |
B | C | Association |
A | D | Association |
You can also take a look at our sample networks folder for some examples.
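The custom-network layout can be checked in the same spirit. The sketch below, again limited to the standard library, reads a CSV edge list, verifies that the ‘Source’ and ‘Target’ columns are present, and treats ‘Relation’ as optional. The helper name `read_edges` is hypothetical, and this is not how DiffuPy loads networks internally.

```python
import csv
import io

# The custom-network example from the table above, written as CSV.
NETWORK_CSV = (
    "Source,Target,Relation\n"
    "A,B,Increase\n"
    "B,C,Association\n"
    "A,D,Association\n"
)


def read_edges(text, delimiter=","):
    """Return (source, target, relation) tuples from a delimited edge list.

    Hypothetical helper, not part of DiffuPy. The relation is None when the
    optional 'Relation' column is absent.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = set(reader.fieldnames or [])
    missing = {"Source", "Target"} - header
    if missing:
        raise ValueError(f"Missing required column(s): {sorted(missing)}")
    return [(row["Source"], row["Target"], row.get("Relation")) for row in reader]


edges = read_edges(NETWORK_CSV)
# Collect every node that appears as a source or a target.
nodes = sorted({node for source, target, _ in edges for node in (source, target)})
```

The resulting edge tuples can then be handed to whichever graph library you prefer for further inspection.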